Cross-Reference to Related Applications

This application claims the benefit of the filing date of U.S. application 
Serial No. 60/444,094, filed January 31, 2003, and U.S. application Serial No. 
60/474,659, filed May 30, 2003, under 35 U.S.C. § 119(e), and incorporates
those applications by reference herein. 

Field of the Invention 

This invention relates to the field of biochemical assays and reagents. 
More specifically, this invention relates to mutant proteins covalently linked 
(tethered) to one or more functional groups and to methods for their use. 



Background of the Invention 

The specific detection of molecules is a keystone in understanding the 
role of that molecule in the cell. Labels, e.g., those that are covalently linked to 
a molecule of interest, permit the ready detection of that molecule in a complex
mixture. The label may be one that is added by chemical synthesis in vitro or
attached in vivo, e.g., via recombinant techniques. For instance, the attachment 
of fluorescent or other labels onto proteins has traditionally been accomplished 
by in vitro chemical modification after protein purification (Hermanson, 1996). 
For in vivo attachment of a label, green fluorescent protein (GFP) from the
jellyfish Aequorea victoria can be genetically fused with many host proteins to
produce fluorescent chimeras in situ (Tsien, 1998; Chalfie et al., 1998). 
However, while GFP-based indicators are currently employed in a variety of 
assays, e.g., measuring pH (Kneen et al., 1998; Llopis et al., 1998; Miesenbock
et al., 1998), Ca2+ (Miyawaki et al., 1997; Rosomer et al., 1997), and membrane
potential (Siegel et al., 1997), the fluorescence of intrinsically labeled proteins
such as GFP is limited by the properties of protein structure, e.g., a limited range 
of fluorescent colors and relatively low intrinsic brightness (Cubitt et al., 1995; 
Ormo et al., 1996).

To address the deficiencies of GFP labeling in situ, Griffen et al. (1998)
synthesized a tight-binding pair of molecular components: a small receptor 
domain composed of as few as six natural amino acids and a small (< 700 
dalton), synthetic ligand that could be linked to various spectroscopic probes or 
crosslinks. The receptor domain included four cysteines at the i, i + 1, i + 4, and
i + 5 positions of an α-helix and the ligand was 4',5'-bis(1,3,2-dithioarsolan-2-yl)fluorescein
(FLASH). Griffen et al. disclose that the ligand had relatively few
binding sites in nontransfected mammalian cells, was membrane-permeant and 
was nonfluorescent until it bound with high affinity and specificity to a 
tetracysteine domain in a recombinant protein, resulting in cells being
fluorescently labeled ("FLASH" labeled) with a nanomolar or lower dissociation
constant. However, with respect to background binding in cells, Stroffekova et 
al. (2001) disclose that FLASH-EDT2 binds non-specifically to endogenous
cysteine-rich proteins. Furthermore, labeling proteins by FLASH is limited by
the range of fluorophores that may be used.

Receptor-mediated targeting methods use genetically encoded targeting 
sequences to localize fluorophores to virtually any cellular site, provided that the 
targeted protein is able to fold properly. For example, Farinas et al. (1999) 
disclose that cDNA transfection was used to target a single-chain antibody (sFv) 
to a specified site in a cell. Farinas et al. disclose that conjugates of a hapten
(4-ethoxymethylene-2-phenyl-2-oxazolin-5-one, phOx) and a fluorescent probe
(e.g., BODIPY FL, tetramethylrhodamine, and fluorescein) were bound with high
affinity (about 5 nM) to the subcellular site for the sFv in living Chinese hamster 
ovary cells, indicating that the targeted antibody functioned as a high affinity 
receptor for the cell-permeable hapten-fluorophore conjugates. Nevertheless,
functional sFv expression may be relatively poor in reducing environments. 

Thus, what is needed is an improved method to label a desired protein. 

Summary of the Invention

The invention provides methods, compositions and kits for tethering
(linking), e.g., via a covalent or otherwise stable bond, one or more functional
groups to a protein of the invention or to a fusion protein (chimera) which 
includes a protein of the invention. A protein of the invention is structurally 
related to a wild-type (native) hydrolase but comprises at least one amino acid 
substitution relative to the corresponding wild-type hydrolase and binds a
substrate of the corresponding wild-type hydrolase but lacks or has reduced
catalytic activity relative to the corresponding wild-type hydrolase (which
mutant protein is referred to herein as a mutant hydrolase). The aforementioned
tethering occurs, for instance, in solution or suspension, in a cell, on a solid
support or at solution/surface interfaces, by employing a substrate for a 
hydrolase which includes a reactive group and which has been modified to 
include one or more functional groups. As used herein, a "substrate" includes a 
substrate having a reactive group and optionally one or more functional groups.
A substrate which includes one or more functional groups is generally referred to
herein as a substrate of the invention. As used herein, a "functional group" is a 
molecule which is detectable or is capable of detection (e.g., a chromophore, 
fluorophore or luminophore), or can be bound or attached to a second molecule 
(e.g., biotin, hapten, or a cross-linking group) or includes one or more amino
acids, e.g., a peptide or polypeptide including an antibody or receptor, one or
more nucleotides, lipids including lipid bilayers, a solid support, e.g., a 
sedimental particle, and the like. A functional group may have more than one 
property such as being capable of detection and being bound to another 
molecule. As used herein, a "reactive group" is the minimum number of atoms in
a substrate which are specifically recognized by a particular wild-type or mutant
hydrolase of the invention. The interaction of a reactive group in a substrate and 
a wild-type hydrolase results in a product and the regeneration of the wild-type 
hydrolase. A substrate, e.g., a substrate of the invention, may also optionally 
include a linker, e.g., a cleavable linker. 

A substrate useful in the invention is one which is specifically bound by
a mutant hydrolase, and preferably results in a bond formed with an amino acid,
e.g., the reactive residue, of the mutant hydrolase which bond is more stable than 
the bond formed between the substrate and the corresponding amino acid of the 
wild-type hydrolase. While the mutant hydrolase specifically binds substrates
which may be specifically bound by the corresponding wild-type hydrolase, no
product or substantially less product, e.g., 2-, 10-, 100-, or 1000-fold less, is 
formed from the interaction between the mutant hydrolase and the substrate 
under conditions which result in product formation by a reaction between the
corresponding wild-type hydrolase and substrate. The lack of, or reduced
amounts of, product formation by the mutant hydrolase is due to at least one 
substitution in the mutant hydrolase, which substitution results in the mutant 
hydrolase forming a bond with the substrate which is more stable than the bond 
formed between the corresponding wild-type hydrolase and the substrate.

Preferably, the bond formed between a mutant hydrolase and a substrate of the 
invention has a half-life (i.e., t½) that is at least 2-fold, and more preferably at
least 4- or even 10-fold, and up to 100-, 1000- or 10,000-fold, greater than the t½
of the bond formed between a corresponding wild-type hydrolase and the
substrate under conditions which result in product formation by the
corresponding wild-type hydrolase. Preferably, the bond formed between the
mutant hydrolase and the substrate has a t½ of at least 30 minutes and preferably
at least 4 hours, and up to at least 10 hours, and is resistant to disruption by 
washing, protein denaturants, and/or high temperatures, e.g., the bond is stable to 
boiling in SDS.
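
The half-life comparisons above can be illustrated with a minimal sketch. It assumes that the bond between hydrolase and substrate decays with simple first-order kinetics, which is an assumption made here for illustration and is not stated in the text. The function names and the wild-type half-life in the usage note are hypothetical.

```python
import math


def rate_constant(half_life_min: float) -> float:
    """First-order rate constant (per minute) for a given half-life in minutes."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / half_life_min


def fraction_remaining(half_life_min: float, elapsed_min: float) -> float:
    """Fraction of the complex still intact after elapsed_min minutes,
    assuming first-order decay (an illustrative assumption)."""
    return math.exp(-rate_constant(half_life_min) * elapsed_min)


def fold_increase(mutant_half_life_min: float, wild_type_half_life_min: float) -> float:
    """Ratio of the mutant half-life to the wild-type half-life."""
    if wild_type_half_life_min <= 0:
        raise ValueError("wild-type half-life must be positive")
    return mutant_half_life_min / wild_type_half_life_min
```

For instance, under this assumption a bond with a half-life of 4 hours (240 minutes) compared with a hypothetical wild-type intermediate lasting 24 minutes gives `fold_increase(240, 24)` of 10, and `fraction_remaining(240, 60)` estimates how much complex would survive a one-hour wash.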

In one embodiment, the substrate is a substrate for a dehalogenase, e.g., a 
haloalkane dehalogenase or a dehalogenase that cleaves carbon-halogen bonds in 
an aliphatic or aromatic halogenated substrate, such as a substrate for 
Rhodococcus, Staphylococcus, Pseudomonas, Burkholderia, Agrobacterium or
Xanthobacter dehalogenase, or a substrate for a serine beta-lactamase. In one
embodiment, a substrate of the invention optionally includes a linker which 
physically separates one or more functional groups from the reactive group in 
the substrate. For instance, for some mutant hydrolases, i.e., those with deep 
catalytic pockets, a substrate of the invention can include a linker of sufficient 
length and structure so that the one or more functional groups of the substrate of
the invention do not disturb the 3-D structure of the hydrolase (wild-type or 
mutant). For example, one substrate of the invention for a
dehalogenase includes a reactive group such as (CH2)2-3X where X is a halide
and a functional group such as tetramethylrhodamine (TAMRA), e.g.,
TAMRA-C14H24O4-Cl.

In one embodiment, a linker is preferably 12 to 30 atoms in length. The 
linker may not always be present in a substrate of the invention, however, in 
some embodiments, the physical separation of the reactive group and the
functional group may be needed so that the reactive group can interact with the
reactive residue in the mutant hydrolase to form a covalent bond. Preferably, 
when present, the linker does not substantially alter, e.g., impair, the specificity 
or reactivity of a substrate having the linker with the wild-type or mutant 
hydrolase relative to the specificity or reactivity of a corresponding substrate
which lacks the linker with the wild-type or mutant hydrolase. Further, the 
presence of the linker preferably does not substantially alter, e.g., impair, one or 
more properties, e.g., the function, of the functional group. 

Thus, the invention provides a compound of formula (I): R-linker-A-X,
wherein R is one or more functional groups, wherein the linker is a multiatom
straight or branched chain including C, N, S, or O, wherein A-X is a substrate for 
a dehalogenase, and wherein X is a halogen. In one embodiment, an alkylhalide 
is covalently attached to a linker, L, which is a group or groups that covalently 
attach one or more functional groups to form a substrate for a dehalogenase. As
described herein, a mutant dehalogenase, DhaA.H272F, was bound to substrates
for DhaA which included 5-(and 6-) carboxy fluorescein (FAM), e.g.,
FAM-C14H24O4-Cl, TAMRA, e.g., TAMRA-C14H24O4-Cl, and biotin, e.g.,
biotin-C18H32O4-Cl, and there was no significant quenching effect of this binding on
FAM or TAMRA fluorescence or on biotin binding to streptavidin. As also
described herein, a mutant dehalogenase, e.g., DhaA.D106C and DhaA.D106E
as well as DhaA.D106C:H272F and DhaA.D106E:H272F, bound
FAM-C14H24O4-Cl and/or TAMRA-C14H24O4-Cl. In one embodiment, the substrate is
R-(CH2)2O(CH2)2O(CH2)2O(CH2)6Cl, wherein R is a functional group. To
prepare such a substrate, a functional group may be reacted with a molecule such
as NH(CH2)2O(CH2)2O(CH2)2O(CH2)6Cl.

In one embodiment, substrates of the invention are permeable to the 
plasma membranes of cells. For instance, as described herein the plasma 
membranes of prokaryotic (E. coli) and eukaryotic (CHO-K1) cells were
permeable to TAMRA-C14H24O4-Cl and biotin-C18H32O4-Cl, and these
substrates were rapidly and efficiently loaded into and washed out of cells in the
absence of a mutant hydrolase. In the presence of a mutant hydrolase, at least a 
portion of the substrate was prevented from being washed out of the cells. Thus, 
the bound portion of the substrate can serve as a marker or as a means to capture
the mutant hydrolase or a fusion thereof. 

The invention further provides methods for preparing a substrate for a
hydrolase which substrate is modified to include one or more functional groups.
Exemplary functional groups for use in the invention include, but are not limited
to, an amino acid, protein, e.g., enzyme, antibody or other immunogenic protein, 
a radionuclide, a nucleic acid molecule, a drug, a lipid, biotin, avidin, 
streptavidin, a magnetic bead, a solid support, an electron opaque molecule, 
chromophore, MRI contrast agent, a dye, e.g., a xanthene dye, a calcium
sensitive dye, e.g., 1-[2-amino-5-(2,7-dichloro-6-hydroxy-...
phenoxy]-2-(2'-amino-5'-methylphenoxy)...
(Fluo-3), a sodium sensitive dye, e.g., 1,3-benzenedicarboxylic acid,
4,4'-[1,4,10,13-tetraoxa-7,16-diazacyclooctadecane-7,16-diylbis(5-methoxy-6,2-benzofurandiyl)]bis
(PBFI), a NO sensitive dye, e.g., 4-amino-5-methylamino-2',7'-difluorescein,
or other fluorophore. In one embodiment, the functional
group is an immunogenic molecule, i.e., one which is bound by antibodies 
specific for that molecule. In one embodiment, the functional group is not a 
radionuclide. 

The invention also includes a mutant hydrolase which comprises at least 
one amino acid substitution relative to a corresponding wild-type hydrolase,
which substitution(s) renders the mutant hydrolase capable of forming a bond, 
e.g., a covalent bond with a substrate for the corresponding hydrolase, e.g., a 
substrate of the invention, which is more stable than the bond formed between a 
corresponding wild-type hydrolase and the substrate.

In one embodiment, the mutant hydrolase of the invention comprises at
least one amino acid substitution in a residue which, in the wild-type hydrolase,
is associated with activating a water molecule, e.g., a residue in a catalytic triad 
or an auxiliary residue, wherein the activated water molecule cleaves the bond 
formed between a catalytic residue in the wild-type hydrolase and a substrate of 
the hydrolase. As used herein, an "auxiliary residue" is a residue which alters
the activity of another residue, e.g., it enhances the activity of a residue that 
activates a water molecule. Residues which activate water within the scope of 
the invention include but are not limited to those involved in acid-base catalysis,
for instance, histidine, aspartic acid and glutamic acid. In another embodiment,
the mutant hydrolase of the invention comprises at least one amino acid 
substitution in a residue which, in the wild-type hydrolase, forms an ester 
intermediate by nucleophilic attack of a substrate for the hydrolase.

For example, wild-type dehalogenase DhaA cleaves carbon-halogen
bonds in halogenated hydrocarbons (HaloC3-HaloC10). The catalytic center of
DhaA is a classic catalytic triad including a nucleophile, an acid and a histidine
residue. The amino acids in the triad are located deep inside the catalytic pocket
of DhaA (about 10 Å long and about 20 Å² in cross section). The halogen atom
in a halogenated substrate for DhaA, for instance, the chlorine atom of a
Cl-alkane substrate, is positioned in close proximity to the catalytic center of DhaA.
DhaA binds the substrate, likely forms an ES complex, and an ester intermediate
is formed by nucleophilic attack of the substrate by Asp106 (the numbering is
based on the protein sequence of DhaA) of DhaA (Figure 1). His272 of DhaA
then activates water and the activated water hydrolyzes the intermediate,
releasing product from the catalytic center. As described herein, mutant DhaAs,
e.g., a DhaA.H272F mutant, which likely retains the 3-D structure based on a
computer modeling study and basic physico-chemical characteristics of
wild-type DhaA (DhaA.WT), were not capable of hydrolyzing one or more substrates
of the wild-type enzyme, e.g., for Cl-alkanes, releasing the corresponding
alcohol released by the wild-type enzyme. As further described herein, mutant
serine beta-lactamases, e.g., a blaZ.E166D mutant, a blaZ.N170Q mutant and a 
blaZ.E166D:N170Q mutant, were not capable of hydrolyzing one or more 
substrates of a wild-type serine beta-lactamase. 

Thus, in one embodiment of the invention, a mutant hydrolase is a
mutant dehalogenase comprising at least one amino acid substitution in a residue
which, in the wild-type dehalogenase, is associated with activating a water
molecule, e.g., a residue in a catalytic triad or an auxiliary residue, wherein the
activated water molecule cleaves the bond formed between a catalytic residue in
the wild-type dehalogenase and a substrate of the dehalogenase. In one
embodiment, at least one substitution is in a residue corresponding to residue
272 in DhaA from Rhodococcus rhodochrous. A "corresponding residue" is a
residue which has the same activity (function) in one wild-type protein relative
to a reference wild-type protein and optionally is in the same relative position
when the primary sequences of the two proteins are aligned. For example, a 
residue which forms part of a catalytic triad and activates a water molecule in 
one enzyme may be residue 272 in that enzyme, which residue 272 corresponds 
to residue 73 in another enzyme, wherein residue 73 forms part of a catalytic
triad and activates a water molecule. Thus, in one embodiment, a mutant 
dehalogenase of the invention has a phenylalanine residue at a position 
corresponding to residue 272 in DhaA from Rhodococcus rhodochrous. In
another embodiment of the invention, a mutant hydrolase is a mutant
dehalogenase comprising at least one amino acid substitution in a residue
corresponding to residue 106 in DhaA from Rhodococcus rhodochrous. For
example, a mutant dehalogenase of the invention has a cysteine or a glutamate 
residue at a position corresponding to residue 106 in DhaA from Rhodococcus 
rhodochrous. In a further embodiment, the mutant hydrolase is a mutant 
dehalogenase comprising at least two amino acid substitutions, one in a residue
corresponding to residue 106 and one in a residue corresponding to residue 272 
in DhaA from Rhodococcus rhodochrous. In yet a further embodiment, the 
mutant hydrolase is a mutant serine beta-lactamase comprising at least one 
amino acid substitution in a residue corresponding to residue 166 or residue 170 
in a serine beta-lactamase of Staphylococcus aureus PC1.
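
The positional part of the "corresponding residue" definition above can be sketched as a lookup across an existing pairwise alignment. This is only an illustration: the text defines correspondence primarily by shared function, which an alignment alone cannot establish. The function name and the toy aligned sequences in the usage note are hypothetical and are not DhaA or beta-lactamase sequences.

```python
from typing import Optional


def corresponding_position(aligned_ref: str, aligned_other: str,
                           ref_position: int, gap: str = "-") -> Optional[int]:
    """Map a 1-based residue number in the reference sequence to the 1-based
    residue number aligned with it in the other sequence.

    Both inputs are rows of one pairwise alignment, so they must have equal
    length. Returns None when the reference residue aligns with a gap.
    """
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have equal length")
    if ref_position < 1:
        raise ValueError("positions are 1-based")
    ref_count = 0    # residues seen so far in the reference row
    other_count = 0  # residues seen so far in the other row
    for ref_char, other_char in zip(aligned_ref, aligned_other):
        if ref_char != gap:
            ref_count += 1
        if other_char != gap:
            other_count += 1
        if ref_char != gap and ref_count == ref_position:
            return None if other_char == gap else other_count
    raise ValueError("ref_position lies beyond the reference sequence")
```

With the toy rows `"MSEH-GFDT"` and `"--EHKGF-T"`, reference residue 4 (H) maps to residue 2 of the other sequence, which mirrors how residue 272 in one enzyme may correspond to residue 73 in another.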

The mutant hydrolase may be a fusion protein, e.g., a fusion protein 
expressed from a recombinant DNA which encodes the mutant hydrolase and at 
least one protein of interest or a fusion protein formed by chemical synthesis. 
For instance, the fusion protein may comprise a mutant hydrolase and an enzyme 
of interest, e.g., luciferase, RNasin or RNase, and/or a channel protein, a
receptor, a membrane protein, a cytosolic protein, a nuclear protein, a structural
protein, a phosphoprotein, a kinase, a signaling protein, a metabolic protein, a 
mitochondrial protein, a receptor associated protein, a fluorescent protein, an 
enzyme substrate, a transcription factor, a transporter protein and/or a targeting 
sequence, e.g., a myristoylation sequence, a mitochondrial localization sequence,
or a nuclear localization sequence, that directs the mutant hydrolase, for 
example, a fusion protein, to a particular location. The protein of interest may be 
fused to the N-terminus or the C-terminus of the mutant hydrolase. In one 
embodiment, the fusion protein comprises a protein of interest at the N-terminus,
and another protein, e.g., a different protein, at the C-terminus, of the mutant 
hydrolase. For example, the protein of interest may be a fluorescent protein or 
an antibody. Optionally, the proteins in the fusion are separated by a connector 
sequence, e.g., preferably one having at least 2 amino acid residues, such as one
having 13 to 17 amino acid residues. The presence of a connector sequence in a 
fusion protein of the invention does not substantially alter the function of either 
protein in the fusion relative to the function of each individual protein. Thus, for 
a fusion of a mutant dehalogenase and Renilla luciferase, the presence of a
connector sequence does not substantially alter the stability of the bond formed
between the mutant dehalogenase and a substrate therefor or the activity of the 
luciferase. For any particular combination of proteins in a fusion, a wide variety 
of connector sequences may be employed. In one embodiment, the connector 
sequence is a sequence recognized by an enzyme, e.g., a cleavable sequence.
For instance, the connector sequence may be one recognized by a caspase, e.g.,
DEVD (SEQ ID NO:64), or is a photocleavable sequence. 

In one embodiment, the fusion protein may comprise a protein of interest 
at the N-terminus and, preferably, a different protein of interest at the C-terminus 
of the mutant hydrolase. As described herein, fusions of a mutant DhaA with
GST (at the N-terminus), a Flag sequence (at the C-terminus) and Renilla
luciferase (at the N-terminus or C-terminus) had no detectable effect on bond
formation between the mutant DhaA and a substrate for wild-type DhaA which 
includes a functional group. Moreover, a fusion of a Flag sequence and 
DhaA.H272F could be attached to a solid support via a
streptavidin-biotin-C18H32O4-DhaA.H272F bridge (an SFlag-ELISA experiment). Further, a fusion
of Renilla luciferase (R.Luc) and DhaA.H272F could be attached to Magnesil™ 
particles coated with a substrate for wild-type DhaA which includes a functional 
group. In addition, the attached fusion comprising R.Luc was shown to be 
enzymatically active.

Exemplary proteins of interest include, but are not limited to, an
immunogenic protein, fluorescent protein, selectable marker protein, membrane
protein, cytosolic protein, nuclear protein, structural protein, enzyme, e.g., 
RNase, enzyme substrate, receptor protein, transporter protein, transcription 
factor, channel protein, e.g., ion channel protein, phospho-protein, kinase,
signaling protein, metabolic protein, mitochondrial protein, receptor associated 
protein, nucleic acid binding protein, extracellular matrix protein, secreted 
protein, receptor ligand, serum protein, or a protein with reactive cysteines.

The invention also includes compositions and kits comprising a substrate
for a hydrolase which includes a linker, a substrate for a hydrolase which
includes one or more functional groups and optionally a linker, a linker which 
includes one or more functional groups, a substrate for a hydrolase which lacks 
one or more functional groups and optionally includes a linker, a linker, or a 
mutant hydrolase, or any combination thereof. For example, the invention
includes a solid support comprising a substrate of the invention, a kit comprising
a substrate of the invention, a kit comprising a vector encoding a dehalogenase 
of the invention, or a kit comprising a vector encoding a serine beta-lactamase of 
the invention. 

Also provided is an isolated nucleic acid molecule (polynucleotide)
comprising a nucleic acid sequence encoding a hydrolase. In one embodiment,
the isolated nucleic acid molecule comprises a nucleic acid sequence which is 
optimized for expression in at least one selected host. Optimized sequences 
include sequences which are codon optimized, i.e., codons which are employed
more frequently in one organism relative to another organism, e.g., a distantly
related organism, as well as modifications to add or modify Kozak sequences 
and/or introns, and/or to remove undesirable sequences, for instance, potential 
transcription factor binding sites. In one embodiment, the polynucleotide 
includes a nucleic acid sequence encoding a dehalogenase, which nucleic acid
sequence is optimized for expression in a selected host cell. In one embodiment,
the optimized polynucleotide no longer hybridizes to the corresponding non- 
optimized sequence, e.g., does not hybridize to the non-optimized sequence 
under medium or high stringency conditions. In another embodiment, the 
polynucleotide has less than 90%, e.g., less than 80%, nucleic acid sequence
identity to the corresponding non-optimized sequence and optionally encodes a
polypeptide having at least 80%, e.g., at least 85%, 90% or more, amino acid
sequence identity with the polypeptide encoded by the non-optimized sequence.
Constructs, e.g., expression cassettes, and vectors comprising the isolated
nucleic acid molecule, as well as kits comprising the isolated nucleic acid
molecule, construct or vector are also provided.
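
The sequence identity thresholds above can be illustrated with a minimal sketch that scores two rows of an existing alignment. The text does not say how identity is computed, so the definition used here (identical columns divided by all columns in which at least one row has a residue) is an assumption, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of one pairwise alignment.

    Columns where both rows hold a gap are ignored; a residue opposite a gap
    counts as a mismatch. This denominator is an illustrative choice.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

As a toy case, `percent_identity("ATGGCT", "ATGGCC")` scores two six-base sequences that differ only in the third base of the second codon. GCT and GCC both encode alanine, so the nucleic acid identity falls below 100% while the encoded dipeptide is unchanged, which is the pattern a codon-optimized sequence would show.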

Further provided is a method of expressing a mutant hydrolase of the 
invention. The method comprises introducing to a host cell a recombinant 
nucleic acid molecule encoding a mutant hydrolase of the invention so as to
express the mutant hydrolase. In one embodiment, the mutant hydrolase may be 
isolated from the cell. The mutant hydrolase may be expressed transiently or 
stably, constitutively or under tissue-specific or drug-regulated promoters, and 
the like. Also provided is an isolated host cell comprising a recombinant nucleic
acid molecule encoding a mutant hydrolase of the invention.

In one embodiment, the invention provides a method to detect or 
determine the presence or amount of a mutant hydrolase. The method includes 
contacting a mutant hydrolase with a hydrolase substrate which comprises one or 
more functional groups. The mutant hydrolase comprises at least one amino acid
substitution relative to a corresponding wild-type hydrolase, wherein the at least
one amino acid substitution results in the mutant hydrolase forming a bond with 
the substrate which is more stable than the bond formed between the 
corresponding wild-type hydrolase and the substrate, and wherein the at least 
one amino acid substitution in the mutant hydrolase is a substitution at an amino
acid residue in the corresponding wild-type hydrolase that is associated with
activating a water molecule which cleaves the bond formed between the 
corresponding wild-type hydrolase and the substrate or at an amino acid residue 
in the corresponding wild-type hydrolase that forms an ester intermediate with 
the substrate. The presence or amount of the functional group is detected or
determined, thereby detecting or determining the presence or amount of the
mutant hydrolase. In one embodiment, the mutant hydrolase is in or on the 
surface of a cell. In another embodiment, the mutant hydrolase is in a cell lysate. 

Also provided are methods of using a mutant hydrolase and a substrate 
for a corresponding hydrolase which includes one or more functional groups,
e.g., to isolate a molecule or to detect or determine the presence or amount of,
location, e.g., intracellular, subcellular or extracellular location, or movement of 
certain molecules in cells. In one embodiment, a method to isolate a molecule of 
interest in a sample is provided. The method includes contacting a sample with a 

WO 2004/072232 



PCT/US2004/002607 



fusion protein comprising a mutant hydrolase and a protein which binds a 
molecule of interest with a hydrolase substrate which comprises one or more 
functional groups. The mutant hydrolase comprises at least one amino acid 
substitution relative to a corresponding wild-type hydrolase, wherein the at least 
one amino acid substitution results in the mutant hydrolase forming a bond with
the substrate which is more stable than the bond formed between the 
corresponding wild-type hydrolase and the substrate, and wherein the at least 
one amino acid substitution in the mutant hydrolase is a substitution at an amino 
acid residue in the corresponding wild-type hydrolase that is associated with
activating a water molecule which cleaves the bond formed between the
corresponding wild-type hydrolase and the substrate or at an amino acid residue
in the corresponding wild-type hydrolase that forms an ester intermediate with 
the substrate. In one embodiment, at least one functional group is a solid support 
or a molecule which binds to a solid support. In one embodiment, the sample 
contains intact cells while in another embodiment, the sample is a cell lysate or
subcellular fraction. Then the molecule of interest is isolated. 

For example, the invention includes a method to isolate a protein of
interest. The method includes contacting a fusion protein comprising a mutant
hydrolase and a protein of interest with a hydrolase substrate which comprises at
least one functional group. The mutant hydrolase comprises at least one amino
acid substitution relative to a corresponding wild-type hydrolase, wherein the at 
least one amino acid substitution results in the mutant hydrolase forming a bond 
with the substrate which is more stable than the bond formed between the wild- 
type hydrolase and the substrate, and wherein the at least one amino acid 
substitution in the mutant hydrolase is a substitution at an amino acid residue in
the wild-type hydrolase that is associated with activating a water molecule which 
cleaves a bond formed between the wild-type hydrolase and the substrate or at an 
amino acid residue in the wild-type hydrolase that forms an ester intermediate 
with the substrate. In one embodiment, at least one functional group is a solid 
support or a molecule which binds to a solid support. Then the protein of
interest is isolated. 

In another embodiment, the invention includes a method to identify an 
agent that alters the interaction of a protein of interest with a molecule suspected 
of interacting with the protein of interest. The method includes contacting at
least one agent with the molecule suspected of interacting with the protein of
interest, a fusion protein comprising a mutant hydrolase and the protein of interest,
and a hydrolase substrate which comprises one or more functional groups. The
mutant hydrolase comprises at least one amino acid substitution relative to a
corresponding wild-type hydrolase, wherein the at least one amino acid 
substitution results in the mutant hydrolase forming a bond with the substrate 
which is more stable than the bond formed between the corresponding wild-type 
hydrolase and the substrate, and wherein the at least one amino acid substitution
in the mutant hydrolase is a substitution at an amino acid residue in the
corresponding wild-type hydrolase that is associated with activating a water
molecule which cleaves a bond formed between the corresponding wild-type
hydrolase and the substrate or at an amino acid residue in the wild-type hydrolase
that forms an ester intermediate with the substrate. In one embodiment, at least
one functional group is a solid support or a molecule which binds to a solid
support. Then it is determined whether the agent alters the interaction between
the protein of interest and the molecule suspected of interacting with the protein 
of interest. 

Moreover, a substrate of the invention bound to a solid support or a 
mutant hydrolase bound to a solid support may be used to generate protein
arrays, cell arrays, vesicle/organelle arrays and cell membrane arrays. 

The invention thus provides methods to monitor the expression, location 
and/or movement (trafficking) of proteins in a cell as well as to monitor changes 
in microenvironments within a cell. In one embodiment, the use of a mutant 
hydrolase and a substrate of the invention permits functional analysis of proteins, 
e.g., ion channels. In another embodiment, the use of two pairs of a mutant 
hydrolase/substrate permits multiplexing, simultaneous detection, and FRET- or 
BRET-based assays. For example, mutant dehalogenases with substitutions at 
different residues of a catalytic triad may each preferentially bind certain 
substrates of the invention but not others, or a mutant dehalogenase and a mutant 
beta-lactamase may be employed with their respective substrates, thus permitting 
multiplexing. Other applications include capturing the stable complex which 
results from contacting the mutant hydrolase with a corresponding substrate of 
the invention, on a solid substrate for analytical or industrial purposes (e.g., to 
study kinetic parameters of the tethered enzyme, to generate enzyme 
chains/arrays, to metabolize industrial components, and the like), to detect 
protein-protein interactions, to determine the effect of different 
compounds/drugs on an interaction between a fusion protein comprising a 
protein of interest and a mutant hydrolase with other molecules, to isolate or 
purify molecules which bind to a protein of interest fused to the mutant 
hydrolase, or to isolate or purify cells, organelles or fragments thereof. For 
example, a protein of interest may be fused to a mutant hydrolase and then 
linked to a solid support via the specific interaction of a functional group which 
is a ligand for an acceptor group and is present in a substrate of the invention, 
with an acceptor group present on the solid support. Such a substrate may be 
contacted with the fusion protein prior to contact with the solid support, 
contacted with the solid support prior to contact with the fusion protein, or 
simultaneously contacted with the fusion protein and the solid support. Such a 
system permits the resulting complex to be employed to detect or isolate 
molecules which bind to the protein of interest. The binding molecule may be a 
protein, e.g., a fusion of the binding protein and a functional group, e.g., GFP, 
luciferase, an antibody, e.g., one conjugated to horseradish peroxidase (HRP), 
alkaline phosphatase (AP) or a fluorophore. 

To isolate, sort or purify cells, the mutant hydrolase may be expressed on 
the outside surface of cells (e.g., via a fusion with a plasma membrane protein). 
To isolate, purify or separate organelles, the mutant hydrolase is expressed on 
the cytosolic surface of the organelle of interest. In another embodiment, to 
create an optimal platform for growing different cells, the mutant hydrolase is 
fused with an extracellular matrix component or an outer membrane protein and 
tethered to a three-dimensional cell culture or a platform for tissue engineering. 
As an example, primary neurons or embryonic stem cells may be grown on the 
platform to form a feeder layer. 

Other applications include detecting or labeling cells. Thus, the use of a 
mutant hydrolase and a corresponding substrate of the invention permits the 
detection of cells, for instance, to detect cell migration in vitro or in vivo after 
implantation or injection into animals (e.g., angiogenesis/chemotaxis assays, 
migration of implanted neurons, normal, malignant, or recombinantly modified 
cells implanted/injected into animals, and the like), and live cell imaging 
followed by immunocytochemistry. In another embodiment, the invention 
provides a method to label newly synthesized proteins. For example, cells 
comprising a vector which expresses a mutant hydrolase of the invention or a 
fusion thereof, are contacted with a substrate for the hydrolase which lacks a 
functional group. Cells are then contacted with an agent, e.g., an inducer of gene 
expression, and a substrate for the hydrolase which contains one or more 
functional groups. The presence, amount or location of the mutant hydrolase or 
fusion thereof is then detected or determined. The presence, amount or location 
of the mutant hydrolase or fusion thereof is due to newly synthesized mutant 
hydrolase or a fusion thereof. Alternatively, cells comprising a vector which 
expresses a mutant hydrolase of the invention or a fusion thereof, are contacted 
with a substrate for the hydrolase having a functional group, e.g., a green 
fluorophore, then contacted with an agent and a substrate having a different 
functional group, e.g., a red fluorophore. In one embodiment, the mutant 
hydrolase is fused to a membrane localization signal and so can be employed to 
monitor events in or near the membrane. 

Accordingly, the invention provides a method to label a cell. The 
method includes contacting a cell comprising a mutant hydrolase with a 
hydrolase substrate which comprises one or more functional groups. The mutant 
hydrolase comprises at least one amino acid substitution relative to a 
corresponding wild-type hydrolase, wherein the at least one amino acid 
substitution results in the mutant hydrolase forming a bond with the substrate 
which is more stable than the bond formed between the corresponding wild-type 
hydrolase and the substrate, and wherein the at least one amino acid substitution 
in the mutant hydrolase is a substitution at an amino acid residue in the 
corresponding wild-type hydrolase that is associated with activating a water 
molecule which cleaves a bond formed between the corresponding wild-type 
hydrolase and the substrate or at an amino acid residue in the corresponding 
wild-type hydrolase that forms an ester intermediate with the substrate. Then the 
presence or amount of the functional group is detected or determined. 

Cells expressing selectable marker proteins, such as ones conferring 
resistance to neomycin, hygromycin, or puromycin, are used to stably transform 
cells with foreign DNA. It may be desirable to observe which cells contain 
selectable marker proteins as well as fluorescently labeled molecules. For 
instance, it may be preferable to label the selectable marker protein with a 
fluorescent molecule that is added exogenously to living cells. By this method, 
the selectable marker protein becomes visible only when needed by 
addition of the fluorophore, and the fluorescence will subsequently be lost when 
selectable marker proteins are naturally regenerated through cellular metabolism. 
Thus, in one embodiment, the invention provides a method for labeling a cell 
which expresses a selectable marker protein. The method includes providing a 
cell comprising an expression cassette comprising a nucleic acid sequence 
encoding a fusion protein. The fusion protein comprises a selectable marker 
protein, e.g., one which confers resistance to at least one antibiotic, and a second 
protein that is capable of stably and optionally irreversibly binding a substrate or 
a portion thereof which includes an optically detectable molecule. For instance, 
the protein may be an alkyl transferase which irreversibly transfers an alkyl 
group and an optically detectable molecule from a substrate to itself, thereby 
labeling the alkyl transferase, e.g., an alkyl transferase such as O6-alkylguanine 
DNA alkyltransferase. Exemplary proteins useful in this embodiment of the 
invention include, but are not limited to, alkyl transferases, peptidyl 
glycine-alpha-amidating monooxygenases, type I topoisomerases, hydrolases, e.g., serine 
and epoxide hydrolases as well as the mutant hydrolases described herein, 
aminotransferases, cytochrome P450 monooxygenases, acetyl transferases, 
decarboxylases, oxidases, e.g., monoamine oxidases, reductases, e.g., 
ribonucleotide reductase, synthetases, e.g., cyclic ADP ribose synthetase or 
thymidylate synthetase, dehydrogenases, e.g., aldehyde dehydrogenase, 
synthases, e.g., nitric oxide synthase (NOS), lactamases, cystathionine 
gamma-lyases, peptidases, e.g., carboxypeptidase A, aromatase, proteases, e.g., serine 
protease, xylanases, glucosidases, mannosidases, and demethylases and other 
proteins, including wild-type proteins, which form an irreversible or otherwise 
stable bond with one or more substrates, e.g., enzymes which are capable of 
mechanism-based inactivation. Thus, in this embodiment, a stable bond, i.e., 
one which is formed between a substrate and a wild-type or mutant enzyme, has 
a t½ of at least 30 minutes and preferably at least 4 hours, and up to at least 10 
hours, and is resistant to disruption by washing, protein denaturants, and/or high 
temperatures, e.g., the bond is stable to boiling in SDS. 
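
The half-life thresholds above can be made concrete with a short calculation. The following is only an illustrative sketch: it assumes, purely for illustration, that loss of the enzyme-substrate bond follows simple first-order kinetics, which the text does not state. The function names `fraction_remaining` and `meets_stability_threshold` are hypothetical.

```python
def fraction_remaining(t_minutes, half_life_minutes):
    """Fraction of enzyme-substrate bonds still intact after t_minutes.

    Assumption (not from the text): the bond decays with first-order
    kinetics, so the intact fraction halves every half-life.
    """
    if half_life_minutes <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (t_minutes / half_life_minutes)


def meets_stability_threshold(half_life_minutes, minimum_minutes=30.0):
    """True if the half-life reaches the 30 minute minimum given in the text."""
    return half_life_minutes >= minimum_minutes


if __name__ == "__main__":
    # Compare the 30 minute, 4 hour and 10 hour half-lives after a 1 hour wash.
    for half_life in (30.0, 240.0, 600.0):
        left = fraction_remaining(60.0, half_life)
        print(f"t1/2 = {half_life:5.0f} min: {left:.2f} of label remains after 60 min")
```

Under this assumption, a bond with the minimum 30 minute half-life would retain a quarter of its label after one hour, whereas one with a 4 hour half-life would retain most of it.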

The cell which expresses the fusion protein is contacted with the 
substrate so as to label the cell. In one embodiment, the cell is fixed prior to 
contact with the substrate. In another embodiment, the substrate and fixative are 
contacted with the cell at the same time. In yet another embodiment, the fixative 
is added to the cell after the cell is contacted with the substrate. In one 
embodiment, the fusion protein forms an ester bond with the substrate. In 
another embodiment, the fusion protein forms a thioester bond with the 
substrate. Also provided is a fusion gene encoding the fusion protein, and a cell 
which expresses the fusion protein. 

When performing image analysis on a cell, it may be desirable to fix the 
cell with a preservative (fixative) such as paraformaldehyde, acetone or 
methanol which generally maintains most features of cellular structure. Such 
fixed cells are then often analyzed by adding fluorescent stains or fluorescently 
labeled antibodies to reveal specific structures within the cells. Another method 
to fluorescently label cells is to express a fluorescent protein, e.g., GFP, in cells 
prior to fixation. Unfortunately, the efficient fluorescence of these proteins is 
dependent on protein structure, which can be disrupted by preservatives, thus 
decreasing the efficiency of imaging in those cells. 

Accordingly, the invention provides a method for labeling a cell with a 
functional group, e.g., a fluorophore. The method includes providing a cell which 
expresses a mutant hydrolase of the invention or a fusion thereof, and contacting 
the cell with a hydrolase substrate which includes at least one functional group. 
In one embodiment, the cell is fixed prior to contact with the substrate. In 
another embodiment, the substrate and fixative are contacted with the cell at the 
same time. In yet another embodiment, the fixative is added to the cell after the 
cell is contacted with the substrate. Then the presence or location of the mutant 
hydrolase, or fusion thereof, in the cell is detected or determined. In one 
embodiment, the mutant hydrolase forms an ester bond with the substrate, while 
in another embodiment, the mutant hydrolase forms a thioester bond with the 
substrate. 

The invention also provides processes and intermediates disclosed herein 
that are useful for preparing compounds, compositions, nucleic acids, proteins, 
or other materials of the invention. 

Brief Description of the Figures 

Figure 1 is a schematic of a reaction in the catalytic triad of Rhodococcus 
rhodochrous dehalogenase with an alkylhalide substrate. 

Figure 2 shows a three-dimensional model of a wild-type DhaA 
Rhodococcus rhodochrous dehalogenase and four mutant DhaAs (H283Q, G, A 
or F). A cyan ribbon is a 3-D model of the DhaA.WT based on the crystal 
structure of this protein (Newman et al., 1999) (panel A). The purple ribbon is a 
3-D model of the H272Q, H272G and H272A mutants (panel A), or a 3-D model 
of the H272F mutant (panel B). Three-dimensional models were generated by 
calculating a Molecular Probability Density Function followed by several 
optimization steps including a Restrained Simulated Annealing Molecular 
Dynamics (MD) scheme. 3-D modeling was done on a Silicon Graphics 
computer-station using the software InsightII (USA). 

Figure 3 shows the purification of wild-type and mutant DhaA proteins. 
GST-DhaA.WT-Flag (odd numbered lanes) and GST-DhaA.H272F-Flag (even 
numbered lanes) fusion proteins were found to be soluble and efficiently purified 
on GSH-Sepharose 4FF (lanes 3 and 4-crude E. coli supernatant; lanes 5 and 6- 
washes; lanes 7 through 10-purified proteins). Treatment of the fusion proteins 
with Factor Xa led to the formation of two proteins, GST and DhaA (WT or 
mutant; lanes 11 and 12, respectively). Moreover, GST was efficiently removed 
on GSH-Sepharose 4FF (WT or mutant; lanes 13 and 14, respectively). All 
proteins had the predicted molecular weight. 

Figure 4 illustrates the hydrolysis of 1-Cl-butane by wild-type DhaA and 
mutant DhaAs. 

Figure 5 shows precipitation of DhaA.WT and DhaA.H272F/A/G/Q 
mutants with various concentrations of (NH4)2SO4. Lanes 1, 5, and 9, 0 % 
(NH4)2SO4; lanes 2, 6, and 10, 10 % (NH4)2SO4; lanes 3, 7, and 11, 10-45 % 
(NH4)2SO4; and lanes 4, 8, and 12, 45-70 % (NH4)2SO4. Panel A: lanes 1-4, 
DhaA.WT; lanes 5-8, DhaA.H272G; and lanes 9-12, DhaA.H272Q. Panel B: 
lanes 1-4, DhaA.WT; lanes 5-8, DhaA.H272F; and lanes 9-12, DhaA.H272A. 

Figure 6 depicts the substrate specificity of wild-type DhaA. Using a 
phenol red-based assay (E558), the initial rate of the reaction was determined 
during the first 60 seconds after enzyme addition by four 15 second readings. 

Figure 7 shows substrates for DhaA which include a functional group 
(e.g., 5-(and 6-)-carboxyfluorescein (FAM), Anth (anthracene) or biotin) and a 
linker. 

Figure 8A shows an HPLC separation of products of FAM-C14H24O4-Cl 
hydrolysis by wild-type DhaA. 

Figure 8B shows an HPLC analysis of product (as a percent of substrate) 
produced by wild-type DhaA hydrolysis of FAM-C14H24O4-Cl over time. 

Figure 9 shows SDS-PAGE analysis of the binding of wild-type DhaA 
(lanes 1, 3, and 5 in panel A and lanes 1-8 in panel B) and mutant DhaA 
(DhaA.H272F) (lanes 2, 4, and 6 in panel A and lanes 9-14 in panel B) to 
TAMRA-C14H24O4-Cl (lanes 1 and 2 in panel A); ROX-C14H24O4-Cl (lanes 3 
and 4 in panel A); FAM-C14H24O4-Cl (lanes 5 and 6 in panel A); or 
biotin-C18H32O4-Cl (panel B). The concentration of biotin-C18H32O4-Cl in panel B was: 0 
µM (lanes 1 and 8), 125 µM (lanes 2 and 9), 25 µM (lanes 3 and 10), 5 µM (lanes 
4 and 11), 1 µM (lanes 5 and 12), 0.2 µM (lanes 6 and 13), and 0.04 µM (lanes 7 
and 14). 

Figure 10 illustrates that pretreatment of a mutant DhaA with a substrate, 
biotin-C18H32O4-Cl, blocks binding of another substrate. DhaA.WT-lanes 1 and 
2; DhaA.H272 mutants: F, lanes 3 and 4; G, lanes 5 and 6; A, lanes 7 and 8; and 
Q, lanes 9 and 10. Samples 2, 4, 6, 8, and 10 were pretreated with 
biotin-C18H32O4-Cl. 

Figure 11 shows MALDI-TOF analysis of enzyme substrate complexes. 
Mass spectra of GST-DhaA.WT or GST-DhaA.H272F incubated with 
FAM-C14H24O4-Cl. 

Figure 12 illustrates SDS-PAGE analysis of the binding properties of 
DhaA mutants with substitutions at residue 106, and DhaA mutants with 
substitutions at residue 106 and residue 272, to TAMRA-C14H24O4-Cl. 2 µg of 
protein and 25 µM TAMRA-C14H24O4-Cl in 32 µl were incubated for one hour 
at room temperature. 10 µl of each reaction was loaded per lane. Lane 1- 
DhaA.D106C; lane 2-DhaA.D106C:H272F; lane 3-DhaA.D106E; lane 4- 
DhaA.D106E:H272F; lane 5-DhaA.D106Q; lane 6-DhaA.D106Q:H272F; lane 
7-DhaA.WT; and lane 8-DhaA.H272F. The gel was imaged with a 570 nm 
filter. 

Figure 13 depicts analysis of Renilla luciferase activity in samples having 
a fusion of luciferase and a mutant DhaA tethered to a solid support (a 
streptavidin coated plate). Capture of the fusion was accomplished using a 
substrate of DhaA (i.e., biotin-C18H32O4-Cl). No activity was found in fractions 
with a fusion of Renilla luciferase and wild-type DhaA. 

Figure 14 shows SDS-PAGE analysis of two-fold serial dilutions of E. 
coli expressing either wild-type DhaA (DhaA.WT-Flag, lanes 1-4 of each panel) 
or mutant DhaA.H272F (DhaA.H272F-Flag, lanes 5-7 of each panel) treated 
with biotin-C18H32O4-Cl (panel A) or TAMRA-C12H24O4-Cl (panel B) in vivo. 
Arrows mark proteins with Mr corresponding to Mr of DhaA-Flag. 

Figure 15 shows the binding of TAMRA-C12H24O4-Cl to eukaryotic cell 
proteins in vivo. Two-fold serial dilutions of proteins from CHO-K1 cells 
expressing either DhaA.WT-Flag (lanes 1-4) or DhaA.H272F-Flag (lanes 5-8) 
were treated with TAMRA-C12H24O4-Cl. Arrows mark proteins with Mr 
corresponding to Mr of DhaA-Flag. 

Figure 16 illustrates the permeability of TAMRA-C12H24O4-Cl to CHO-K1 
cells. CHO-K1 cells (A, bright field image) were treated with 
TAMRA-C12H28O4-Cl (25 µM, for 5 minutes at 37°C) and quickly washed with PBS (panel 
B). Panel C shows the cells after the washing procedure. 

Figure 17 shows images of cells transfected with 
GFP-connector-DhaA.WT-Flag or GFP-connector-DhaA.H272F-Flag. CHO-K1 cells were 
transfected with DNA coding GFP-connector-DhaA.WT-Flag (panels A-C) or 
GFP-connector-DhaA.H272F-Flag (panels D-F) and treated with 
TAMRA-C12H28O4-Cl. Panels A, D-bright field; panels B, E-GFP filter set; and panels C, 
F-TAMRA filter set. 

Figure 18 shows Western blot analysis of proteins from cells transfected 
with GFP-connector-DhaA.WT-Flag (lanes 1-4) or 
GFP-connector-DhaA.H272F-Flag (lanes 5-8). CHO-K1 cells were transfected with either 
GFP-connector-DhaA.WT-Flag or GFP-connector-DhaA.H272F-Flag and then treated 
with TAMRA-C14H24O4-Cl (25 µM) for 0, 5, 15 or 60 minutes, washed with 
PBS (4 x 1.0 ml), and collected in SDS-sample buffer. The samples were 
resolved on SDS-PAGE, and analyzed on a fluoroimager. Lanes 1-4, 
GFP-connector-DhaA.WT-Flag treated for 0, 5, 15, or 60 minutes, respectively. 
Lanes 5-8, GFP-connector-DhaA.H272F-Flag treated for 0, 5, 15, or 60 minutes, 
respectively. Arrows mark proteins with Mr corresponding to Mr of 
GFP-connector-DhaA.H272F-Flag. 

Figure 19 illustrates the toxicity of selected substrates (panel A, TAMRA 
and panel B, ROX) for CHO-K1 cells. 

Figure 20 illustrates a reaction scheme for a serine beta-lactamase. The 
reaction begins with the formation of a precovalent encounter complex (Figure 
19A), and moves through a high-energy acylation tetrahedral intermediate 
(Figure 19B) to form a transiently stable acyl-enzyme intermediate, forming an 
ester through the catalytic residue Ser70 (Figure 19C). Subsequently, the 
acyl-enzyme is attacked by hydrolytic water (Figure 19D) to form a high-energy 
deacylation intermediate (Figure 19E) (Minasov et al., 2002), which collapses to 
form the hydrolyzed product (Figure 19F). The product is then expelled, 
regenerating free enzyme. 

Figure 21 shows hydrolysis of FAP by GST-blaZ over time. 

Figure 22 shows the binding of bocillin to fusions of GST and 
blaZ.E166D, blaZ.N170Q or blaZ.E166D:N170Q. Lane 1-dye/no blaZ; lane 2- 
blaZ.WT; lane 3-blaZ.E166D; lane 4-blaZ.N170Q; and lane 5- 
blaZ.E166D:N170Q. 

Figure 23 shows the binding of CCF2 to fusions of GST and 
blaZ.E166D, blaZ.N170Q or blaZ.E166D:N170Q. Lane 1-dye/no blaZ; lane 2- 
GST-blaZ.WT; lane 3-GST-blaZ.E166D; lane 4-GST-blaZ.N170Q; and lane 5- 
GST-blaZ.E166D:N170Q. 

Figure 24 provides fluorescence and DIC images of living CHO-K1 cells 
transfected with a construct encoding GFP-connector-DhaA.H272F-NLS3 and 
stained with TAMRA-C14H24O4-Cl. TAMRA filter-top left; GFP filter-top 
right; "A" and "B" overlaid-bottom left; overlaid image "C" and DIC image of 
the cell-bottom right. NLS3 = tandem repeat of a nuclear localization sequence 
from SV40 T antigen. 

Figure 25 shows fluorescence images of living CHO-K1 cells transfected 
with a construct encoding GFP-β-arrestin2 (left) and a construct encoding 
DhaA.H272F-β-arrestin2 and stained with TAMRA-C14H24O4-Cl (right). 

Figure 26 shows an SDS-PAGE analysis of DhaA expression in E. coli. 
Lanes: 1, Molecular weight standards; 2, Wild-type DhaA crude lysate; 3, 
Wild-type DhaA cell-free lysate; 4, DhaA.H272F crude lysate; 5, DhaA.H272F 
cell-free lysate; 6, vector control crude lysate; 7, vector control cell-free lysate; 8, 
DhaA.E130Q C1 mutant crude lysate; 9, DhaA.E130Q C1 mutant cell-free 
lysate; 10, DhaA.E130L A5 mutant crude lysate; 11, DhaA.E130L A5 mutant 
cell-free lysate; 12, DhaA.E130A A12 mutant crude lysate; 13, DhaA.E130A 
A12 mutant cell-free lysate; 14, Molecular weight standards. The arrow indicates 
the location of the DhaA protein. -s, lysate before centrifugation; +s, lysate after 
centrifugation. 

Figure 27 shows an immunoblot analysis of DhaA containing lysates. 
Lanes: 1, Wild-type DhaA crude lysate; 2, Wild-type DhaA cell-free lysate; 3, 
DhaA.H272F crude lysate; 4, DhaA.H272F cell-free lysate; 5, vector control 
crude lysate; 6, vector control cell-free lysate; 7, Molecular weight standards; 8, 
DhaA.E130Q C1 mutant crude lysate; 9, DhaA.E130Q C1 mutant cell-free 
lysate; 10, DhaA.E130L A5 mutant crude lysate; 11, DhaA.E130L A5 mutant 
cell-free lysate; 12, DhaA.E130A A12 mutant crude lysate; 13, DhaA.E130A 
A12 mutant cell-free lysate; 14, Molecular weight standards. The arrow indicates 
the location of the DhaA protein. 

Figure 28 provides fluoroimage analysis of in vitro covalent 
alkyl-enzyme formation. Lanes: 1, Fluorescent molecular weight standards; 2, DhaA 
wild-type; 3, DhaA.H272F mutant; 4, DhaA- (vector only control); 5, 
DhaA.E130Q mutant; 6, DhaA.E130L mutant; 7, DhaA.E130A mutant. The 
arrow indicates the location of the fluorescent enzyme-alkyl covalent 
intermediate. 

Figure 29 provides fluoroimage analysis of covalent alkyl-enzyme 
formation in whole cells. Lanes: 1, Fluorescent molecular weight standards; 2, 
DhaA wild-type; 3, DhaA.H272F mutant; 4, DhaA- (vector only control); 5, 
DhaA.E130Q mutant; 6, DhaA.E130L mutant; 7, DhaA.E130A mutant; 8, 
Fluorescent molecular weight standards. The arrow indicates the location of the 
fluorescent enzyme-alkyl covalent intermediate. 

Figures 30 A-B show Western blot analyses of DhaA-Flag captured on 
streptavidin (SA) coated beads. CHO-K1 cells transiently expressing 
DhaA.H272F-Flag were treated with (A) or without (B) biotin-C18H32O4-Cl (25 
µM, 0.1% DMSO, 60 minutes, 37°C). Excess biotin-C18H32O4-Cl was washed 
out, cells were lysed, and 10 µl of cell lysate was incubated with 5 µl of 
SA-coated beads (Pierce) for 60 minutes at room temperature (RT). Cell lysates 
(lane 1), proteins which were not bound to beads (lane 2), and proteins which 
were bound to beads (lane 3) were resolved on SDS-PAGE, transferred to 
nitrocellulose membrane, and probed with anti-Flag antibody (Sigma). 

Figures 30 C-D illustrate analyses of hR.Luc-DhaA captured on SA 
coated beads. CHO-K1 cells transiently expressing hR.Luc-connector- 
DhaA.H272F-Flag were treated with or without biotin-C18H32O4-Cl (25 µM, 
0.1% DMSO, 60 minutes, 37°C). Cells were lysed, and 10 µl of cell lysate was 
incubated with 5 µl of SA-coated beads (Pierce) for 60 minutes at room 
temperature. Unbound material was washed out, and hR.Luc activity determined 
using Promega's "Renilla Luciferase Assay System" (C) or captured hR.Luc 
analyzed by Western blot (D). C) Column 1, cells treated with 
biotin-C18H32O4-Cl, and excess biotin-C18H32O4-Cl washed out; column 2, untreated cells; and 
column 3, cells treated with biotin-C18H32O4-Cl without washing out excess 
biotin-C18H32O4-Cl. D) Cell lysate (lane 1), proteins which were not bound to 
beads (lane 2), and proteins which were bound to beads (lane 3) were resolved 
on SDS-PAGE, transferred to nitrocellulose membrane, and probed with 
anti-R.Luc antibody (Chemicon). 

Detailed Description of the Invention 

Definitions 

A "nucleophile" is a molecule which donates electrons. 

A "selectable marker protein" encodes an enzymatic activity that confers 
to a cell the ability to grow in medium lacking what would otherwise be an 
essential nutrient (e.g., the TRP1 gene in yeast cells) or in a medium with an 
antibiotic or other drug, i.e., the expression of the gene encoding the selectable 
marker protein in a cell confers resistance to an antibiotic or drug to that cell 
relative to a corresponding cell without the gene. When a host cell must express 
a selectable marker to grow in selective medium, the marker is said to be a 
positive selectable marker (e.g., antibiotic resistance genes which confer the 
ability to grow in the presence of the appropriate antibiotic). Selectable markers 
can also be used to select against host cells containing a particular gene (e.g., the 
sacB gene which, if expressed, kills the bacterial host cells grown in medium 
containing 5% sucrose); selectable markers used in this manner are referred to as 
negative selectable markers or counter-selectable markers. Common selectable 
marker gene sequences include those for resistance to antibiotics such as 
ampicillin, tetracycline, kanamycin, puromycin, bleomycin, streptomycin, 
hygromycin, neomycin, Zeocin™, and the like. Selectable auxotrophic gene 
sequences include, for example, hisD, which allows growth in histidine free 
media in the presence of histidinol. Suitable selectable marker genes include a 
bleomycin-resistance gene, a metallothionein gene, a hygromycin 
B-phosphotransferase gene, the AURI gene, an adenosine deaminase gene, an 
aminoglycoside phosphotransferase gene, a dihydrofolate reductase gene, a 
thymidine kinase gene, a xanthine-guanine phosphoribosyltransferase gene, and 
the like. 

A "nucleic acid", as used herein, is a covalently linked sequence of 
nucleotides in which the 3' position of the pentose of one nucleotide is joined by 
a phosphodiester group to the 5' position of the pentose of the next, and in which 
the nucleotide residues (bases) are linked in specific sequence, i.e., a linear order 
of nucleotides. A "polynucleotide", as used herein, is a nucleic acid containing a 
sequence that is greater than about 100 nucleotides in length. An 
"oligonucleotide" or "primer", as used herein, is a short polynucleotide or a 
portion of a polynucleotide. The term "oligonucleotide" or "oligo" as used 
herein is defined as a molecule comprised of 2 or more deoxyribonucleotides or 
ribonucleotides, preferably more than 3, and usually more than 10, but less than 
250, preferably less than 200, deoxyribonucleotides or ribonucleotides. The 
oligonucleotide may be generated in any manner, including chemical synthesis, 
DNA replication, amplification, e.g., polymerase chain reaction (PCR), reverse 
transcription (RT), or a combination thereof. A "primer" is an oligonucleotide 
which is capable of acting as a point of initiation for nucleic acid synthesis when 
placed under conditions in which primer extension is initiated. A primer is 
selected to have on its 3' end a region that is substantially complementary to a 
specific sequence of the target (template). A primer must be sufficiently 
complementary to hybridize with a target for primer elongation to occur. A 
primer sequence need not reflect the exact sequence of the target. For example, a 
non-complementary nucleotide fragment may be attached to the 5' end of the 
primer, with the remainder of the primer sequence being substantially 
complementary to the target. Non-complementary bases or longer sequences can 
be interspersed into the primer provided that the primer sequence has sufficient 
complementarity with the sequence of the target to hybridize and thereby form a 
complex for synthesis of the extension product of the primer. Primers matching 
or complementary to a gene sequence may be used in amplification reactions, 
RT-PCR and the like. 
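
The primer definition above, a 3' region complementary to the target with an optional non-complementary 5' fragment, can be illustrated with a minimal sketch. It is a simplification under stated assumptions: it requires an exact match over a fixed 3' window, whereas the text only requires substantial complementarity. The sequences, the window length `min_match`, and the function names are hypothetical.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def three_prime_anneals(primer, template, min_match=8):
    """True if the last min_match bases at the primer's 3' end are exactly
    complementary to some stretch of the template.

    Simplifying assumption: exact complementarity over a fixed window.
    Bases 5' of that window are ignored, so a non-complementary 5' tail
    does not affect the result.
    """
    tail = primer.upper()[-min_match:]
    if len(tail) < min_match:
        return False
    return reverse_complement(tail) in template.upper()


if __name__ == "__main__":
    template = "ATGGCTAGCTAGGATCCGATTACA"      # hypothetical target, 5' to 3'
    primer = "GAATTC" + "TGTAATCGGATC"         # 5' tail + complementary 3' region
    print(three_prime_anneals(primer, template))
    print(three_prime_anneals("GAATTCAAAAAAAAAA", template))
```

Here the first primer carries a six-base 5' fragment that does not pair with the template, yet its 3' region still anneals; the second primer has no complementary 3' region and is rejected.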

Nucleic acid molecules are said to have a "5'-terminus" (5' end) and a 
"3'-terminus" (3' end) because nucleic acid phosphodiester linkages occur to the 
5' carbon and 3' carbon of the pentose ring of the substituent mononucleotides. 
The end of a polynucleotide at which a new linkage would be to a 5' carbon is its 
5' terminal nucleotide. The end of a polynucleotide at which a new linkage 
would be to a 3' carbon is its 3' terminal nucleotide. A terminal nucleotide, as 
used herein, is the nucleotide at the end position of the 3'- or 5'-terminus. 

DNA molecules are said to have "5' ends" and "3' ends" because 
mononucleotides are reacted to make oligonucleotides in a manner such that the 
5' phosphate of one mononucleotide pentose ring is attached to the 3' oxygen of 
its neighbor in one direction via a phosphodiester linkage. Therefore, an end of 
an oligonucleotide is referred to as the "5' end" if its 5' phosphate is not linked to 
the 3' oxygen of a mononucleotide pentose ring and as the "3' end" if its 3' 
oxygen is not linked to a 5' phosphate of a subsequent mononucleotide pentose 
ring. 

As used herein, a nucleic acid sequence, even if internal to a larger 
oligonucleotide or polynucleotide, also may be said to have 5' and 3' ends. In 
either a linear or circular DNA molecule, discrete elements are referred to as 
being "upstream" or 5' of the "downstream" or 3' elements. This terminology 
reflects the fact that transcription proceeds in a 5' to 3' fashion along the DNA 
strand. Typically, promoter and enhancer elements that direct transcription of a 
linked gene (e.g., open reading frame or coding region) are located 5' 
or upstream of the coding region. However, enhancer elements can exert their 
effect even when located 3' of the promoter element and the coding region. 
Transcription termination and polyadenylation signals are located 3' or 
downstream of the coding region. 

The term "codon" as used herein, is a basic genetic coding unit, 
consisting of a sequence of three nucleotides that specify a particular amino acid 
to be incorporated into a polypeptide chain, or a start or stop signal. The term 
"coding region" when used in reference to a structural gene refers to the nucleotide 
sequences that encode the amino acids found in the nascent polypeptide as a 
result of translation of a mRNA molecule. Typically, the coding region is 
bounded on the 5' side by the nucleotide triplet "ATG" which encodes the 
initiator methionine and on the 3' side by a stop codon (e.g., TAA, TAG, TGA). 
In some cases the coding region is also known to initiate by a nucleotide triplet 
"TTG". 
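
The boundaries of a coding region described above can be sketched as a simple scan for a start triplet followed by the first in-frame stop codon. This is a minimal illustration, not a gene-finding method: it reads only the given strand, ignores introns and other biology, and the function name `find_coding_region` and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_region(dna, start_codons=("ATG",)):
    """Return the first stretch that begins with a start codon and ends at
    the first in-frame stop codon (stop codon included), or None.

    Pass start_codons=("ATG", "TTG") to also allow the alternative
    initiator triplet mentioned in the text.
    """
    seq = dna.upper()
    for i in range(len(seq) - 2):
        if seq[i:i + 3] not in start_codons:
            continue
        # Walk codon by codon in the reading frame set by the start codon.
        for j in range(i + 3, len(seq) - 2, 3):
            if seq[j:j + 3] in STOP_CODONS:
                return seq[i:j + 3]
    return None


if __name__ == "__main__":
    print(find_coding_region("CCATGGCTTAAGG"))
    print(find_coding_region("TTGAAATGA", start_codons=("ATG", "TTG")))
```

The 5' boundary is set by the start triplet and the 3' boundary by the first stop codon in the same reading frame, so a start codon with no in-frame stop downstream yields no coding region.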

As used herein, the terms "isolated and/or purified" refer to in vitro 
preparation, isolation and/or purification of a nucleic acid molecule, a 
polypeptide, peptide or protein, so that it is not associated with in vivo 
substances. Thus, the term "isolated" when used in relation to a nucleic acid, as 
in "isolated oligonucleotide" or "isolated polynucleotide" refers to a nucleic acid 
sequence that is identified and separated from at least one contaminant with 
which it is ordinarily associated in its source. An isolated nucleic acid is present 
in a form or setting that is different from that in which it is found in nature. In 
contrast, non-isolated nucleic acids (e.g., DNA and RNA) are found in the state 
they exist in nature. For example, a given DNA sequence (e.g., a gene) is found 
on the host cell chromosome in proximity to neighboring genes; RNA sequences 
(e.g., a specific mRNA sequence encoding a specific protein), are found in the 
cell as a mixture with numerous other mRNAs that encode a multitude of 
proteins. Hence, with respect to an "isolated nucleic acid molecule", which 
includes a polynucleotide of genomic, cDNA, or synthetic origin or some 
combination thereof, the "isolated nucleic acid molecule" (1) is not associated 
with all or a portion of a polynucleotide in which the "isolated nucleic acid 
molecule" is found in nature, (2) is operably linked to a polynucleotide which it 
is not linked to in nature, or (3) does not occur in nature as part of a larger 
sequence. The isolated nucleic acid molecule may be present in single-stranded 
or double-stranded form. When a nucleic acid molecule is to be utilized to 
express a protein, the nucleic acid contains at a minimum, the sense or coding 
strand (i.e., the nucleic acid may be single-stranded), but may contain both the 
sense and anti-sense strands (i.e., the nucleic acid may be double-stranded). 

The term "wild-type" as used herein, refers to a gene or gene product that
has the characteristics of that gene or gene product isolated from a naturally
occurring source. A wild-type gene is that which is most frequently observed in
a population and is thus arbitrarily designated the "wild-type" form of the gene.
In contrast, the term "mutant" refers to a gene or gene product that displays
modifications in sequence and/or functional properties (i.e., altered
characteristics) when compared to the wild-type gene or gene product. It is
noted that naturally-occurring mutants can be isolated; these are identified by the
fact that they have altered characteristics when compared to the wild-type gene
or gene product.

The term "recombinant DNA molecule" means a hybrid DNA sequence
comprising at least two nucleotide sequences not normally found together in
nature. The term "vector" is used in reference to nucleic acid molecules
into which fragments of DNA may be inserted or cloned and which can be used to
transfer DNA segment(s) into a cell and are capable of replication in a cell. Vectors
may be derived from plasmids, bacteriophages, viruses, cosmids, and the like.

The terms "recombinant vector", "expression vector" or "construct" as
used herein refer to DNA or RNA sequences containing a desired coding
sequence and appropriate DNA or RNA sequences necessary for the expression
of the operably linked coding sequence in a particular host organism.
Prokaryotic expression vectors include a promoter, a ribosome binding site, an
origin of replication for autonomous replication in a host cell and possibly other
sequences, e.g., an optional operator sequence and optional restriction enzyme sites.
A promoter is defined as a DNA sequence that directs RNA polymerase to bind
to DNA and to initiate RNA synthesis. Eukaryotic expression vectors include a
promoter, optionally a polyadenylation signal and optionally an enhancer
sequence.

A polynucleotide having a nucleotide sequence "encoding a peptide,
protein or polypeptide" means a nucleic acid sequence comprising the coding
region of a gene, or a fragment thereof which encodes a gene product having
substantially the same activity as the corresponding full-length peptide, protein
or polypeptide. The coding region may be present in either a cDNA, genomic
DNA or RNA form. When present in a DNA form, the oligonucleotide may be
single-stranded (i.e., the sense strand) or double-stranded. Suitable control
elements such as enhancers/promoters, splice junctions, polyadenylation signals,
etc. may be placed in close proximity to the coding region of the gene if needed
to permit proper initiation of transcription and/or correct processing of the
primary RNA transcript. Alternatively, the coding region utilized in the
expression vectors of the present invention may contain endogenous
enhancers/promoters, splice junctions, intervening sequences, polyadenylation
signals, etc. In further embodiments, the coding region may contain a
combination of both endogenous and exogenous control elements.

The term "transcription regulatory element" or "transcription regulatory 
sequence" refers to a genetic element or sequence that controls some aspect of 
the expression of nucleic acid sequence(s). For example, a promoter is a 
regulatory element that facilitates the initiation of transcription of an operably 
linked coding region. Other regulatory elements include, but are not limited to,
transcription factor binding sites, splicing signals, polyadenylation signals, 
termination signals and enhancer elements. 

Transcriptional control signals in eukaryotes comprise "promoter" and
"enhancer" elements. Promoters and enhancers consist of short arrays of DNA
sequences that interact specifically with cellular proteins involved in
transcription. Promoter and enhancer elements have been isolated from a variety
of eukaryotic sources including genes in yeast, insect and mammalian cells.
Promoter and enhancer elements have also been isolated from viruses, and
analogous control elements, such as promoters, are also found in prokaryotes.
The selection of a particular promoter and enhancer depends on the cell type
used to express the protein of interest. Some eukaryotic promoters and
enhancers have a broad host range while others are functional in a limited subset
of cell types. For example, the SV40 early gene enhancer is very active in a
wide variety of cell types from many mammalian species and has been widely
used for the expression of proteins in mammalian cells. Other examples of
promoter/enhancer elements active in a broad range of mammalian cell types are
those from the human elongation factor 1α gene (Uetsuki et al., 1989; Kim et al.,
1990; and Mizushima and Nagata, 1990), the long terminal repeats of the
Rous sarcoma virus (Gorman et al., 1982), and the human cytomegalovirus
(Boshart et al., 1985).

The term "promoter/enhancer" denotes a segment of DNA containing
sequences capable of providing both promoter and enhancer functions (i.e., the
functions provided by a promoter element and an enhancer element as described
above). For example, the long terminal repeats of retroviruses contain both
promoter and enhancer functions. The enhancer/promoter may be "endogenous"
or "exogenous" or "heterologous." An "endogenous" enhancer/promoter is one
that is naturally linked with a given gene in the genome. An "exogenous" or
"heterologous" enhancer/promoter is one that is placed in juxtaposition to a gene
by means of genetic manipulation (i.e., molecular biological techniques) such
that transcription of the gene is directed by the linked enhancer/promoter.

The presence of "splicing signals" on an expression vector often results
in higher levels of expression of the recombinant transcript in eukaryotic host
cells. Splicing signals mediate the removal of introns from the primary RNA
transcript and consist of a splice donor and acceptor site (Sambrook et al., 1989).
A commonly used splice donor and acceptor site is the splice junction from the
16S RNA of SV40.

Efficient expression of recombinant DNA sequences in eukaryotic cells
requires expression of signals directing the efficient termination and
polyadenylation of the resulting transcript. Transcription termination signals are
generally found downstream of the polyadenylation signal and are a few hundred
nucleotides in length. The term "poly(A) site" or "poly(A) sequence" as used
herein denotes a DNA sequence which directs both the termination and
polyadenylation of the nascent RNA transcript. Efficient polyadenylation of the
recombinant transcript is desirable, as transcripts lacking a poly(A) tail are
unstable and are rapidly degraded. The poly(A) signal utilized in an expression
vector may be "heterologous" or "endogenous." An endogenous poly(A) signal
is one that is found naturally at the 3' end of the coding region of a given gene in
the genome. A heterologous poly(A) signal is one which has been isolated from
one gene and positioned 3' to another gene. A commonly used heterologous
poly(A) signal is the SV40 poly(A) signal. The SV40 poly(A) signal is
contained on a 237 bp BamH I/Bcl I restriction fragment and directs both
termination and polyadenylation (Sambrook et al., 1989).

Eukaryotic expression vectors may also contain "viral replicons" or "viral
origins of replication." Viral replicons are viral DNA sequences which allow for
the extrachromosomal replication of a vector in a host cell expressing the
appropriate replication factors. Vectors containing either the SV40 or polyoma
virus origin of replication replicate to high copy number (up to 10^4 copies/cell)
in cells that express the appropriate viral T antigen. In contrast, vectors
containing the replicons from bovine papillomavirus or Epstein-Barr virus
replicate extrachromosomally at low copy number (about 100 copies/cell).

The term "in vitro" refers to an artificial environment and to processes or
reactions that occur within an artificial environment. In vitro environments
include, but are not limited to, test tubes and cell lysates. The term "in situ"
refers to cell culture. The term "in vivo" refers to the natural environment (e.g.,
an animal or a cell) and to processes or reactions that occur within a natural
environment.

The term "expression system" refers to any assay or system for
determining (e.g., detecting) the expression of a gene of interest. Those skilled
in the field of molecular biology will understand that any of a wide variety of
expression systems may be used. A wide range of suitable mammalian cells are
available from a wide range of sources (e.g., the American Type Culture
Collection, Rockville, MD). The method of transformation or transfection and
the choice of expression vehicle will depend on the host system selected.
Transformation and transfection methods are described, e.g., in Sambrook et al.,
1989. Expression systems include in vitro gene expression assays where a gene
of interest (e.g., a reporter gene) is linked to a regulatory sequence and the
expression of the gene is monitored following treatment with an agent that
inhibits or induces expression of the gene. Detection of gene expression can be
through any suitable means including, but not limited to, detection of expressed
mRNA or protein (e.g., a detectable product of a reporter gene) or through a
detectable change in the phenotype of a cell expressing the gene of interest.
Expression systems may also comprise assays where a cleavage event or other
nucleic acid or cellular change is detected.

The term "gene" refers to a DNA sequence that comprises coding
sequences and optionally control sequences necessary for the production of a
polypeptide from the DNA sequence. The polypeptide can be encoded by a
full-length coding sequence or by any portion of the coding sequence so long as the
portion encodes a gene product with substantially the same activity as the
full-length polypeptide.

Nucleic acids are known to contain different types of mutations. A 
"point" mutation refers to an alteration in the sequence of a nucleotide at a single 
base position from the wild-type sequence. Mutations may also refer to insertion 
or deletion of one or more bases, so that the nucleic acid sequence differs from a
reference, e.g., a wild-type, sequence.
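
By way of illustration only, the comparison of a nucleic acid sequence with a reference sequence to locate point mutations may be sketched as follows. The function name and the two example sequences are hypothetical and are not part of the disclosure; the sketch assumes that sequences of unequal length indicate an insertion or deletion rather than a point mutation.

```python
def point_mutations(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, variant base) for each
    single-base difference between two sequences of equal length."""
    if len(reference) != len(variant):
        # Unequal lengths imply an insertion or deletion, not a point mutation.
        raise ValueError("lengths differ: an insertion or deletion is present")
    return [
        (i + 1, r, v)
        for i, (r, v) in enumerate(zip(reference.upper(), variant.upper()))
        if r != v
    ]


wild_type = "ATGGCCTGA"  # hypothetical reference (wild-type) sequence
mutant = "ATGGACTGA"     # hypothetical mutant differing at a single base

print(point_mutations(wild_type, mutant))  # [(5, 'C', 'A')]
```

Detection of insertions or deletions would in practice require a sequence alignment, which this sketch does not attempt.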

As used herein, the terms "hybridize" and "hybridization" refer to the
annealing of a complementary sequence to the target nucleic acid, i.e., the ability
of two polymers of nucleic acid (polynucleotides) containing complementary
sequences to anneal through base pairing. The terms "annealed" and
"hybridized" are used interchangeably throughout, and are intended to
encompass any specific and reproducible interaction between a complementary
sequence and a target nucleic acid, including binding of regions having only
partial complementarity. Certain bases not commonly found in natural nucleic
acids may be included in the nucleic acids of the present invention and include,
for example, inosine and 7-deazaguanine. Those skilled in the art of nucleic acid
technology can determine duplex stability empirically considering a number of
variables including, for example, the length of the complementary sequence,
base composition and sequence of the oligonucleotide, ionic strength and
incidence of mismatched base pairs. The stability of a nucleic acid duplex is
measured by the melting temperature, or "Tm". The Tm of a particular nucleic
acid duplex under specified conditions is the temperature at which on average
half of the base pairs have dissociated.
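
As a rough illustration of how length and base composition bear on duplex stability, the following sketch applies the Wallace rule, a common rule of thumb for short oligonucleotides that is not described in this document and is used here only as an assumed stand-in for empirical determination. It ignores ionic strength and mismatches, and the function name and example sequence are hypothetical.

```python
def wallace_tm(oligo: str) -> int:
    """Estimate Tm in degrees C as 2*(A+T) + 4*(G+C) (Wallace rule).

    Intended only for short oligonucleotides; not a substitute for
    empirical measurement under the actual hybridization conditions.
    """
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G and T")
    return 2 * at + 4 * gc


# Hypothetical 16-mer with 8 A/T and 8 G/C bases.
print(wallace_tm("ATGCATGCATGCATGC"))  # 48
```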

The term "stringency" is used in reference to the conditions of
temperature, ionic strength, and the presence of other compounds, under which
nucleic acid hybridizations are conducted. With "high stringency" conditions,
nucleic acid base pairing will occur only between nucleic acid fragments that
have a high frequency of complementary base sequences. Thus, conditions of
"medium" or "low" stringency are often required when it is desired that nucleic
acids which are not completely complementary to one another be hybridized or
annealed together. The art knows well that numerous equivalent conditions can
be employed to comprise medium or low stringency conditions. The choice of
hybridization conditions is generally evident to one skilled in the art and is
usually guided by the purpose of the hybridization, the type of hybridization
(DNA-DNA or DNA-RNA), and the level of desired relatedness between the
sequences (see, e.g., Sambrook et al., 1989; Nucleic Acid Hybridization, A Practical
Approach, IRL Press, Washington D.C., 1985, for a general discussion of the
methods).

The stability of nucleic acid duplexes is known to decrease with an
increased number of mismatched bases, and further to be decreased to a greater
or lesser degree depending on the relative positions of mismatches in the hybrid
duplexes. Thus, the stringency of hybridization can be used to maximize or
minimize stability of such duplexes. Hybridization stringency can be altered by:
adjusting the temperature of hybridization; adjusting the percentage of helix
destabilizing agents, such as formamide, in the hybridization mix; and adjusting
the temperature and/or salt concentration of the wash solutions. For filter
hybridizations, the final stringency of hybridizations often is determined by the
salt concentration and/or temperature used for the post-hybridization washes.

"High stringency conditions" when used in reference to nucleic acid
hybridization include conditions equivalent to binding or hybridization at 42°C
in a solution consisting of 5X SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and
1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5X Denhardt's
reagent and 100 µg/ml denatured salmon sperm DNA followed by washing in a
solution comprising 0.1X SSPE, 1.0% SDS at 42°C when a probe of about 500
nucleotides in length is employed.

"Medium stringency conditions" when used in reference to nucleic acid 
hybridization include conditions equivalent to binding or hybridization at 42°C
in a solution consisting of 5X SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O and
1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5X Denhardt's
reagent and 100 µg/ml denatured salmon sperm DNA followed by washing in a
solution comprising 1.0X SSPE, 1.0% SDS at 42°C when a probe of about 500
nucleotides in length is employed.

"Low stringency conditions" include conditions equivalent to binding or 
hybridization at 42°C in a solution consisting of 5X SSPE (43.8 g/l NaCl, 6.9 g/l
NaH2PO4·H2O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.1% SDS,
5X Denhardt's reagent [50X Denhardt's contains per 500 ml: 5 g Ficoll (Type
400, Pharmacia), 5 g BSA (Fraction V; Sigma)] and 100 µg/ml denatured salmon
sperm DNA followed by washing in a solution comprising 5X SSPE, 0.1% SDS
at 42°C when a probe of about 500 nucleotides in length is employed.
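
The three sets of conditions above differ chiefly in the SSPE strength and SDS content of the wash solution. As a worked illustration, the following sketch scales the stated 5X SSPE composition to the wash strength named for each stringency level. It assumes simple linear dilution of the 5X recipe; the names used are hypothetical.

```python
# Composition of 5X SSPE in g/l, as stated in the definitions above.
SSPE_5X_G_PER_L = {"NaCl": 43.8, "NaH2PO4.H2O": 6.9, "EDTA": 1.85}

# Wash conditions (SSPE strength in X, percent SDS) for each stringency level.
WASH = {"high": (0.1, 1.0), "medium": (1.0, 1.0), "low": (5.0, 0.1)}


def sspe_composition(strength_x: float) -> dict[str, float]:
    """Scale the 5X SSPE recipe linearly to another strength (in X)."""
    factor = strength_x / 5.0
    return {salt: round(g * factor, 3) for salt, g in SSPE_5X_G_PER_L.items()}


for level, (sspe_x, sds_pct) in WASH.items():
    print(f"{level}: {sspe_x}X SSPE {sspe_composition(sspe_x)}, {sds_pct}% SDS")
```

On this assumption, the high stringency wash contains one fiftieth of the salt of the low stringency wash, consistent with the statement that salt concentration of the post-hybridization wash often determines the final stringency.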

By "peptide", "protein" and "polypeptide" is meant any chain of amino
acids, regardless of length or post-translational modification (e.g., glycosylation
or phosphorylation). Unless otherwise specified, the terms are interchangeable.
The nucleic acid molecules of the invention encode a variant (mutant) of a
naturally-occurring (wild-type) protein or fragment thereof which has
substantially the same activity as the full length mutant protein. Preferably, such
a mutant protein has an amino acid sequence that is at least 85%, preferably
90%, and most preferably 95% or 99%, identical to the amino acid sequence of a
corresponding wild-type protein.
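
For illustration, percent identity between a mutant and a wild-type amino acid sequence may be computed as sketched below. The sketch assumes the two sequences have already been aligned (gaps shown as "-") and counts identity over aligned, ungapped positions only; other conventions for the denominator exist, and this document does not specify one. The sequences and function names are hypothetical.

```python
THRESHOLDS = (85, 90, 95, 99)  # percent identity levels named in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions (gaps excluded) with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no aligned residues to compare")
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def thresholds_met(identity: float) -> list[int]:
    """List the stated identity thresholds that a given value satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


wild_type = "MKTAYIAKQRQISFVKSHFS"  # hypothetical 20-residue sequence
mutant = "MKTAFIAKQRQISFVKSHFS"     # one substitution (Y to F at position 5)

identity = percent_identity(wild_type, mutant)
print(identity, thresholds_met(identity))  # 95.0 [85, 90, 95]
```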

Polypeptide molecules are said to have an "amino terminus"
(N-terminus) and a "carboxy terminus" (C-terminus) because peptide linkages
occur between the backbone amino group of a first amino acid residue and the
backbone carboxyl group of a second amino acid residue. The terms
"N-terminal" and "C-terminal" in reference to polypeptide sequences refer to
regions of polypeptides including portions of the N-terminal and C-terminal
regions of the polypeptide, respectively. A sequence that includes a portion of
the N-terminal region of a polypeptide includes amino acids predominantly from
the N-terminal half of the polypeptide chain, but is not limited to such
sequences. For example, an N-terminal sequence may include an interior portion
of the polypeptide sequence including residues from both the N-terminal and
C-terminal halves of the polypeptide. The same applies to C-terminal regions.
N-terminal and C-terminal regions may, but need not, include the amino acid
defining the ultimate N-terminus and C-terminus of the polypeptide,
respectively.

The term "isolated" when used in relation to a polypeptide, as in "isolated
protein" or "isolated polypeptide" refers to a polypeptide that is identified and
separated from at least one contaminant with which it is ordinarily associated in
its source. Thus, an isolated polypeptide (1) is not associated with proteins
found in nature, (2) is free of other proteins from the same source, e.g., free of
human proteins, (3) is expressed by a cell from a different species, or (4) does
not occur in nature. In contrast, non-isolated polypeptides (e.g., proteins and
enzymes) are found in the state they exist in nature. The terms "isolated
polypeptide", "isolated peptide" or "isolated protein" include a polypeptide,
peptide or protein encoded by cDNA or recombinant RNA including one of
synthetic origin, or some combination thereof.

The term "recombinant protein" or "recombinant polypeptide" as used
herein refers to a protein molecule expressed from a recombinant DNA
molecule. In contrast, the term "native protein" is used herein to indicate a
protein isolated from a naturally occurring (i.e., a nonrecombinant) source.
Molecular biological techniques may be used to produce a recombinant form of
a protein with identical properties as compared to the native form of the protein.

The term "fusion polypeptide" as used herein refers to a chimeric protein 
containing a protein of interest (e.g., luciferase, an affinity tag or a targeting 
sequence) joined to a different protein, e.g., a mutant hydrolase. 

As used herein, the term "antibody" refers to a protein having one or
more polypeptides substantially encoded by immunoglobulin genes or fragments
of immunoglobulin genes. The recognized immunoglobulin genes include the
kappa, lambda, alpha, gamma, delta, epsilon and mu constant region genes, as
well as the myriad of immunoglobulin variable region genes. Light chains are
classified as either kappa or lambda. Heavy chains are classified as gamma, mu,
alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG,
IgM, IgA, IgD and IgE, respectively.

The basic immunoglobulin (antibody) structural unit is known to
comprise a tetramer. Each tetramer is composed of two identical pairs of
polypeptide chains, each pair having one "light" (about 25 kD) and one "heavy"
chain (about 50-70 kD). The N-terminus of each chain defines a variable region
of about 100 to 110 or more amino acids primarily responsible for antigen
recognition. The terms variable light chain (VL) and variable heavy chain (VH)
refer to these light and heavy chains respectively.

Antibodies may exist as intact immunoglobulins, or as modifications in a
variety of forms including, for example, FabFc2, Fab, Fv, Fd, (Fab')2, an Fv
fragment containing only the light and heavy chain variable regions, a Fab or
(Fab')2 fragment containing the variable regions and parts of the constant
regions, a single-chain antibody, e.g., scFv, CDR-grafted antibodies and the like.
The heavy and light chain of a Fv may be derived from the same antibody or
different antibodies thereby producing a chimeric Fv region. The antibody may
be of animal (especially mouse or rat) or human origin or may be chimeric or
humanized. As used herein the term "antibody" includes these various forms.

The terms "cell," "cell line," "host cell," as used herein, are used
interchangeably, and all such designations include progeny or potential progeny
of these designations. By "transformed cell" is meant a cell into which (or into
an ancestor of which) has been introduced a nucleic acid molecule of the
invention. Optionally, a nucleic acid molecule of the invention may be
introduced into a suitable cell line so as to create a stably transfected cell line
capable of producing the protein or polypeptide encoded by the nucleic acid
molecule. Vectors, cells, and methods for constructing such cell lines are well
known in the art. The words "transformants" or "transformed cells" include the
primary transformed cells derived from the originally transformed cell without
regard to the number of transfers. All progeny may not be precisely identical in
DNA content, due to deliberate or inadvertent mutations. Nonetheless, mutant
progeny that have the same functionality as screened for in the originally
transformed cell are included in the definition of transformants.

The term "homology" refers to a degree of complementarity. There may
be partial homology or complete homology (i.e., identity). Homology is often
measured using sequence analysis software (e.g., Sequence Analysis Software
Package of the Genetics Computer Group, University of Wisconsin
Biotechnology Center, 1710 University Avenue, Madison, WI 53705). Such
software matches similar sequences by assigning degrees of homology to various
substitutions, deletions, insertions, and other modifications. Conservative
substitutions typically include substitutions within the following groups:
glycine, alanine; valine, isoleucine, leucine; aspartic acid, glutamic acid,
asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine,
tyrosine.
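
The conservative substitution groups listed above lend themselves to a simple lookup. The following sketch, given for illustration only, encodes the six groups in one-letter code and tests whether a given substitution falls within a single group; the function name is hypothetical, and residues absent from every group are treated as having no conservative replacement.

```python
# Conservative substitution groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("GA"),    # glycine, alanine
    set("VIL"),   # valine, isoleucine, leucine
    set("DENQ"),  # aspartic acid, glutamic acid, asparagine, glutamine
    set("ST"),    # serine, threonine
    set("KR"),    # lysine, arginine
    set("FY"),    # phenylalanine, tyrosine
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two residues differ and both fall within one listed group."""
    a, b = original.upper(), replacement.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


print(is_conservative("D", "E"))  # True: both in the acidic/amide group
print(is_conservative("K", "D"))  # False: residues are in different groups
```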

The term "purified" or "to purify" means the result of any process that
removes some of a contaminant from the component of interest, such as a
protein or nucleic acid. The percent of a purified component is thereby
increased in the sample.

The term "operably linked" as used herein refers to the linkage of nucleic
acid sequences in such a manner that a nucleic acid molecule capable of
directing the transcription of a given gene and/or the synthesis of a desired
protein molecule is produced. The term also refers to the linkage of sequences
encoding amino acids in such a manner that a functional (e.g., enzymatically
active, capable of binding to a binding partner, capable of inhibiting, etc.)
protein or polypeptide, or a precursor thereof, e.g., the pre- or prepro-form of the
protein or polypeptide, is produced.



All amino acid residues identified herein are in the natural
L-configuration. In keeping with standard polypeptide nomenclature,
abbreviations for amino acid residues are as shown in the following Table of
Correspondence.

TABLE OF CORRESPONDENCE

1-Letter    3-Letter    AMINO ACID
Y           Tyr         L-tyrosine
G           Gly         L-glycine
F           Phe         L-phenylalanine
M           Met         L-methionine
A           Ala         L-alanine
S           Ser         L-serine
I           Ile         L-isoleucine
L           Leu         L-leucine
T           Thr         L-threonine
V           Val         L-valine
P           Pro         L-proline
K           Lys         L-lysine
H           His         L-histidine
Q           Gln         L-glutamine
E           Glu         L-glutamic acid
W           Trp         L-tryptophan
R           Arg         L-arginine
D           Asp         L-aspartic acid
N           Asn         L-asparagine
C           Cys         L-cysteine
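
For convenience, the Table of Correspondence may be expressed as a lookup, as in the following illustrative sketch. The mapping reproduces the twenty entries of the table; the function name and the example sequence are hypothetical.

```python
# One-letter to three-letter codes, following the Table of Correspondence.
ONE_TO_THREE = {
    "Y": "Tyr", "G": "Gly", "F": "Phe", "M": "Met", "A": "Ala",
    "S": "Ser", "I": "Ile", "L": "Leu", "T": "Thr", "V": "Val",
    "P": "Pro", "K": "Lys", "H": "His", "Q": "Gln", "E": "Glu",
    "W": "Trp", "R": "Arg", "D": "Asp", "N": "Asn", "C": "Cys",
}


def to_three_letter(sequence: str) -> str:
    """Convert a one-letter amino acid sequence to hyphenated three-letter codes."""
    return "-".join(ONE_TO_THREE[residue] for residue in sequence.upper())


print(to_three_letter("MHHHHHH"))  # Met-His-His-His-His-His-His
```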



As used herein, the term "poly-histidine tract" or (His tag) refers to a
molecule comprising two to ten histidine residues, e.g., a poly-histidine tract of
five to ten residues. A poly-histidine tract allows the affinity purification of a
covalently linked molecule on an immobilized metal, e.g., nickel, zinc, cobalt or
copper, chelate column or through an interaction with another molecule (e.g., an
antibody reactive with the His tag).
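
As an illustration of the two-to-ten residue definition above, the following sketch locates runs of consecutive histidine residues in a one-letter amino acid sequence and reports those whose length falls within the stated range. The function name and the example sequence are hypothetical, and treating the tract as an uninterrupted run of histidines is an assumption of the sketch.

```python
import re

HIS_RUN = re.compile(r"H+")  # any maximal run of consecutive histidines


def his_tracts(protein: str, min_len: int = 2, max_len: int = 10) -> list[tuple[int, int]]:
    """Return (1-based start, length) of each histidine run within the range."""
    return [
        (m.start() + 1, len(m.group()))
        for m in HIS_RUN.finditer(protein.upper())
        if min_len <= len(m.group()) <= max_len
    ]


# Hypothetical sequence with a six-histidine tract and one isolated histidine.
print(his_tracts("MGSSHHHHHHSSGLVPRGSH"))  # [(5, 6)]
```

Passing `min_len=5` restricts the search to the five-to-ten residue tracts mentioned as an example in the definition.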

As used herein, "pure" means an object species is the predominant
species present (i.e., on a molar basis it is more abundant than any other
individual species in the composition), and preferably a substantially purified
fraction is a composition wherein the object species comprises at least about 50
percent (on a molar basis) of all macromolecular species present. Generally, a
"substantially pure" composition will comprise more than about 80 percent of all
macromolecular species present in the composition, more preferably more than
about 85%, about 90%, about 95%, or about 99%. Most preferably, the object
species is purified to essential homogeneity (contaminant species cannot be
detected in the composition by conventional detection methods) wherein the
composition consists essentially of a single macromolecular species.
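
The molar thresholds in the foregoing definition can be illustrated with a short sketch. It assumes that the molar amount of each macromolecular species in a composition is known, and it treats the "about 50 percent" and "about 80 percent" figures as hard cut-offs, which is a simplification. All names and the example amounts are hypothetical.

```python
def molar_fractions(moles: dict[str, float]) -> dict[str, float]:
    """Convert molar amounts of each species to fractions of the total."""
    total = sum(moles.values())
    if total <= 0:
        raise ValueError("total amount must be positive")
    return {name: amount / total for name, amount in moles.items()}


def purity_level(moles: dict[str, float], target: str) -> str:
    """Classify `target` using the molar thresholds given in the text."""
    fractions = molar_fractions(moles)
    share = fractions[target]
    others = [f for name, f in fractions.items() if name != target]
    if others and share <= max(others):
        return "not pure"  # not the predominant species
    if share > 0.80:
        return "substantially pure"  # more than about 80 percent
    if share >= 0.50:
        return "substantially purified fraction"  # at least about 50 percent
    return "pure"  # predominant, but under 50 percent


sample = {"mutant hydrolase": 9.0, "contaminant A": 0.6, "contaminant B": 0.4}
print(purity_level(sample, "mutant hydrolase"))  # substantially pure
```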

I. Mutant Hydrolases and Fusions Thereof

Mutant hydrolases within the scope of the invention include but are not
limited to those prepared via recombinant techniques, e.g., site-directed
mutagenesis or recursive mutagenesis, and comprise one or more amino acid
substitutions which render the mutant hydrolase capable of forming a stable, e.g.,
covalent, bond with a substrate, such as a substrate modified to contain one or
more functional groups, for a corresponding nonmutant (wild-type) hydrolase.
Hydrolases within the scope of the invention include, but are not limited to,
peptidases, esterases (e.g., cholesterol esterase), glycosidases (e.g.,
glucoamylase), phosphatases (e.g., alkaline phosphatase) and the like. For
instance, hydrolases include, but are not limited to, enzymes acting on ester
bonds such as carboxylic ester hydrolases, thiolester hydrolases, phosphoric
monoester hydrolases, phosphoric diester hydrolases, triphosphoric monoester
hydrolases, sulfuric ester hydrolases, diphosphoric monoester hydrolases,
phosphoric triester hydrolases, exodeoxyribonucleases producing
5'-phosphomonoesters, exoribonucleases producing 5'-phosphomonoesters,
exoribonucleases producing 3'-phosphomonoesters, exonucleases active with
either ribo- or deoxyribonucleic acid, exonucleases active with either ribo- or
deoxyribonucleic acid, endodeoxyribonucleases producing
5'-phosphomonoesters, endodeoxyribonucleases producing other than
5'-phosphomonoesters, site-specific endodeoxyribonucleases specific for altered
bases, endoribonucleases producing 5'-phosphomonoesters, endoribonucleases
producing other than 5'-phosphomonoesters, endoribonucleases active with
either ribo- or deoxyribonucleic acid, endoribonucleases active with either ribo- or
deoxyribonucleic acid; glycosylases; glycosidases, e.g., enzymes hydrolyzing O- and
S-glycosyl, and hydrolyzing N-glycosyl compounds; enzymes acting on ether bonds such
as trialkylsulfonium hydrolases or ether hydrolases; enzymes acting on peptide
bonds (peptide hydrolases) such as aminopeptidases, dipeptidases,
dipeptidyl-peptidases and tripeptidyl-peptidases, peptidyl-dipeptidases, serine-type
carboxypeptidases, metallocarboxypeptidases, cysteine-type carboxypeptidases,
omega peptidases, serine endopeptidases, cysteine endopeptidases, aspartic
endopeptidases, metalloendopeptidases, threonine endopeptidases, and
endopeptidases of unknown catalytic mechanism; enzymes acting on
carbon-nitrogen bonds, other than peptide bonds, such as those in linear amides, in
cyclic amides, in linear amidines, in cyclic amidines, in nitriles, or other
compounds; enzymes acting on acid anhydrides such as those in
phosphorus-containing anhydrides and in sulfonyl-containing anhydrides; enzymes acting on
acid anhydrides (catalyzing transmembrane movement); enzymes acting on acid
anhydrides or involved in cellular and subcellular movement; enzymes acting on
carbon-carbon bonds (e.g., in ketonic substances); enzymes acting on halide
bonds (e.g., in C-halide compounds); enzymes acting on phosphorus-nitrogen
bonds; enzymes acting on sulfur-nitrogen bonds; enzymes acting on
carbon-phosphorus bonds; and enzymes acting on sulfur-sulfur bonds. Exemplary
hydrolases acting on halide bonds include, but are not limited to, alkylhalidase,
2-haloacid dehalogenase, haloacetate dehalogenase, thyroxine deiodinase,
haloalkane dehalogenase, 4-chlorobenzoate dehalogenase, 4-chlorobenzoyl-CoA
dehalogenase, and atrazine chlorohydrolase. Exemplary hydrolases that act on
carbon-nitrogen bonds in cyclic amides include, but are not limited to,
barbiturase, dihydropyrimidinase, dihydroorotase, carboxymethylhydantoinase,
allantoinase, β-lactamase, imidazolonepropionase, 5-oxoprolinase
(ATP-hydrolysing), creatininase, L-lysine-lactamase, 6-aminohexanoate-cyclic-dimer
hydrolase, 2,5-dioxopiperazine hydrolase, N-methylhydantoinase
(ATP-hydrolysing), cyanuric acid amidohydrolase, and maleimide hydrolase.
"Beta-lactamase" as used herein includes Class A, Class C and Class D beta-lactamases
as well as D-ala carboxypeptidase/transpeptidase, esterase EstB, penicillin
binding protein 2X, penicillin binding protein 5, and D-amino peptidase.
Preferably, the beta-lactamase is a serine beta-lactamase, e.g., one having a
catalytic serine residue at a position corresponding to residue 70 in the serine
beta-lactamase of S. aureus PC1, and a glutamic acid residue at a position
corresponding to residue 166 in the serine beta-lactamase of S. aureus PC1,
optionally having a lysine residue at a position corresponding to residue 73, and
also optionally having a lysine residue at a position corresponding to residue
234, in the beta-lactamase of S. aureus PC1.

In one embodiment, the mutant hydrolase is a haloalkane dehalogenase,
e.g., those found in Gram-negative (Keuning et al., 1985) and
Gram-positive haloalkane-utilizing bacteria (Keuning et al., 1985; Yokota et al., 1987;
Scholtz et al., 1987; Sallis et al., 1990). Haloalkane dehalogenases, including
DhlA from Xanthobacter autotrophicus GJ10 (Janssen et al., 1988, 1989) and
DhaA from Rhodococcus rhodochrous, are enzymes which catalyze hydrolytic
dehalogenation of corresponding hydrocarbons. Halogenated aliphatic
hydrocarbons subject to conversion include C2-C10 saturated aliphatic
hydrocarbons which have one or more halogen groups attached, wherein at least
two of the halogens are on adjacent carbon atoms. Such aliphatic hydrocarbons
include volatile chlorinated aliphatic (VCA) hydrocarbons. VCA's include, for
example, aliphatic hydrocarbons such as dichloroethane, 1,2-dichloro-propane,
1,2-dichlorobutane and 1,2,3-trichloropropane. The term "halogenated
hydrocarbon" as used herein means a halogenated aliphatic hydrocarbon. As
used herein the term "halogen" includes chlorine, bromine, iodine, fluorine,
astatine and the like. A preferred halogen is chlorine.

As described herein, the invention includes a fusion protein comprising a
mutant hydrolase and amino acid sequences for a protein of interest, e.g.,
sequences for a marker protein or affinity tag, e.g., luciferase, GFP, or a
polyhistidine sequence, a nucleic acid binding protein, an extracellular matrix
protein, a secreted protein, a receptor ligand, a serum protein, an immunogenic
protein, a fluorescent protein, a protein with reactive cysteines, a receptor
protein, e.g., NMDA receptor, a channel protein, e.g., a sodium-, potassium- or a
calcium-sensitive channel protein including a HERG channel protein, or a
transporter protein, e.g., EAAT1-4 glutamate transporter, as well as targeting
signals, e.g., a plastid targeting signal, a nuclear localization signal or a
myristoylation sequence.

D. Optimized Hydrolase Sequences, and Vectors and Host Cells Encoding
the Hydrolase

A nucleic acid molecule comprising a nucleic acid sequence encoding a
hydrolase or a fusion thereof is optionally optimized for expression in a
particular host cell and also optionally operably linked to transcription regulatory
sequences, e.g., one or more enhancers, a promoter, a transcription termination
sequence or a combination thereof, to form an expression cassette.

In one embodiment, a nucleic acid sequence encoding a hydrolase or a
fusion thereof is optimized by replacing codons in a wild-type or mutant
hydrolase sequence with codons which are preferentially employed in a
particular (selected) cell. Preferred codons have a relatively high codon usage
frequency in a selected cell, and preferably their introduction results in the
introduction of relatively few transcription factor binding sites for transcription
factors present in the selected host cell, and relatively few other undesirable
structural attributes. Thus, the optimized nucleic acid product has an improved
level of expression due to improved codon usage frequency, and a reduced risk
of inappropriate transcriptional behavior due to a reduced number of undesirable
transcription regulatory sequences.

An isolated and optimized nucleic acid molecule of the invention may
have a codon composition that differs from that of the corresponding wild-type
nucleic acid sequence at more than 30%, 35%, 40% or more than 45%, e.g.,
50%, 55%, 60% or more of the codons. Preferred codons for use in the
invention are those which are employed more frequently than at least one other
codon for the same amino acid in a particular organism and, more preferably, are
also not low-usage codons in that organism and are not low-usage codons in the
organism used to clone or screen for the expression of the nucleic acid molecule.
Moreover, preferred codons for certain amino acids (i.e., those amino acids that
have three or more codons), may include two or more codons that are employed
more frequently than the other (non-preferred) codon(s). The presence of
codons in the nucleic acid molecule that are employed more frequently in one
organism than in another organism results in a nucleic acid molecule which,
when introduced into the cells of the organism that employs those codons more
frequently, is expressed in those cells at a level that is greater than the expression
of the wild-type or parent nucleic acid sequence in those cells.
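
By way of illustration only, the fraction of codons at which an optimized sequence differs from its wild-type counterpart might be computed as sketched below. The function names and the two short sequences are hypothetical and are not taken from any sequence of the invention; the sketch assumes both sequences are in frame and of equal length.

```python
def split_codons(seq):
    """Split an in-frame coding sequence into three-letter codons."""
    seq = seq.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("sequence must be non-empty and a multiple of three long")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fraction_codons_changed(wild_type, optimized):
    """Return the fraction of codon positions that differ between two sequences."""
    wt = split_codons(wild_type)
    opt = split_codons(optimized)
    if len(wt) != len(opt):
        raise ValueError("sequences must contain the same number of codons")
    changed = sum(1 for a, b in zip(wt, opt) if a != b)
    return changed / len(wt)


# Hypothetical five-codon example: 3 of the 5 codons differ.
wild_type = "ATGTTATCAAAATGA"
optimized = "ATGCTGAGCAAGTGA"
print(fraction_codons_changed(wild_type, optimized))
```

A value above 0.30 would correspond to a codon composition differing at more than 30% of the codons.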

In one embodiment of the invention, the codons that are different are
those employed more frequently in a mammal, while in another embodiment the
codons that are different are those employed more frequently in a plant.
Preferred codons for different organisms are known to the art, e.g., see
www.kazusa.or.jp/codon/. A particular type of mammal, e.g., a human, may
have a different set of preferred codons than another type of mammal. Likewise,
a particular type of plant may have a different set of preferred codons than
another type of plant. In one embodiment of the invention, the majority of the
codons that differ are ones that are preferred codons in a desired host cell.

Preferred codons for organisms including mammals (e.g., humans) and plants are
known to the art (e.g., Wada et al., 1990; Ausubel et al., 1997). For example,
preferred human codons include, but are not limited to, CGC (Arg), CTG (Leu),
TCT (Ser), AGC (Ser), ACC (Thr), CCA (Pro), CCT (Pro), GCC (Ala), GGC
(Gly), GTG (Val), ATC (Ile), ATT (Ile), AAG (Lys), AAC (Asn), CAG (Gln),
CAC (His), GAG (Glu), GAC (Asp), TAC (Tyr), TGC (Cys) and TTC (Phe)
(Wada et al., 1990). Thus, in one embodiment, synthetic nucleic acid molecules
of the invention have a codon composition which differs from a wild type
nucleic acid sequence by having an increased number of the preferred human
codons, e.g., CGC, CTG, TCT, AGC, ACC, CCA, CCT, GCC, GGC, GTG,
ATC, ATT, AAG, AAC, CAG, CAC, GAG, GAC, TAC, TGC, TTC, or any
combination thereof. For example, the nucleic acid molecule of the invention
may have an increased number of CTG or TTG leucine-encoding codons, GTG
or GTC valine-encoding codons, GGC or GGT glycine-encoding codons, ATC
or ATT isoleucine-encoding codons, CCA or CCT proline-encoding codons,
CGC or CGT arginine-encoding codons, AGC or TCT serine-encoding codons,
ACC or ACT threonine-encoding codons, GCC or GCT alanine-encoding codons,
or any combination thereof, relative to the wild-type nucleic acid sequence. In
another embodiment, preferred C. elegans codons include, but are not limited to,
UUC (Phe), UUU (Phe), CUU (Leu), UUG (Leu), AUU (Ile), GUU (Val), GUG
(Val), UCA (Ser), UCU (Ser), CCA (Pro), ACA (Thr), ACU (Thr), GCU (Ala),
GCA (Ala), UAU (Tyr), CAU (His), CAA (Gln), AAU (Asn), AAA (Lys), GAU
(Asp), GAA (Glu), UGU (Cys), AGA (Arg), CGA (Arg), CGU (Arg), GGA
(Gly), or any combination thereof. In yet another embodiment, preferred
Drosophila codons include, but are not limited to, UUC (Phe), CUG (Leu),
CUC (Leu), AUC (Ile), AUU (Ile), GUG (Val), GUC (Val), AGC (Ser), UCC
(Ser), CCC (Pro), CCG (Pro), ACC (Thr), ACG (Thr), GCC (Ala), GCU (Ala),
UAC (Tyr), CAC (His), CAG (Gln), AAC (Asn), AAG (Lys), GAU (Asp), GAG
(Glu), UGC (Cys), CGC (Arg), GGC (Gly), GGA (Gly), or any combination
thereof. Preferred yeast codons include but are not limited to UUU (Phe), UUG
(Leu), UUA (Leu), CCU (Leu), AUU (Ile), GUU (Val), UCU (Ser), UCA (Ser),
CCA (Pro), CCU (Pro), ACU (Thr), ACA (Thr), GCU (Ala), GCA (Ala), UAU
(Tyr), UAC (Tyr), CAU (His), CAA (Gln), AAU (Asn), AAC (Asn), AAA
(Lys), AAG (Lys), GAU (Asp), GAA (Glu), GAG (Glu), UGU (Cys), CGU
(Trp), AGA (Arg), CGU (Arg), GGU (Gly), GGA (Gly), or any combination
thereof. Similarly, nucleic acid molecules having an increased number of
codons that are employed more frequently in plants have a codon composition
which differs from a wild-type or parent nucleic acid sequence by having an
increased number of the plant codons including, but not limited to, CGC (Arg),
CTT (Leu), TCT (Ser), TCC (Ser), ACC (Thr), CCA (Pro), CCT (Pro), GCT
(Ser), GGA (Gly), GTG (Val), ATC (Ile), ATT (Ile), AAG (Lys), AAC (Asn),
CAA (Gln), CAC (His), GAG (Glu), GAC (Asp), TAC (Tyr), TGC (Cys), TTC
(Phe), or any combination thereof (Murray et al., 1989). Preferred codons may
differ for different types of plants (Wada et al., 1990).
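
The codon replacement described above can be sketched, purely for illustration, as follows. The preferred-codon table uses human codons recited in this description (Wada et al., 1990); where two codons are recited for an amino acid, choosing one of them is an assumption made only to keep the sketch short. The function names and the example sequence are hypothetical, and the sketch makes no attempt to avoid transcription factor binding sites or other undesirable sequences.

```python
from itertools import product

# Standard genetic code, built in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One preferred human codon per amino acid (illustrative selection).
# Met, Trp and stop codons are absent from the table and are left unchanged.
PREFERRED_HUMAN = {
    "R": "CGC", "L": "CTG", "S": "AGC", "T": "ACC", "P": "CCA",
    "A": "GCC", "G": "GGC", "V": "GTG", "I": "ATC", "K": "AAG",
    "N": "AAC", "Q": "CAG", "H": "CAC", "E": "GAG", "D": "GAC",
    "Y": "TAC", "C": "TGC", "F": "TTC",
}


def translate(seq):
    """Translate an in-frame DNA sequence into one-letter amino acid codes."""
    seq = seq.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def optimize_codons(seq, preferred=PREFERRED_HUMAN):
    """Replace each codon with the preferred codon for the same amino acid."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        out.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(out)


wild_type = "ATGTTATCAAAATGA"          # hypothetical sequence: M L S K stop
optimized = optimize_codons(wild_type)
assert translate(optimized) == translate(wild_type)  # encoded protein unchanged
print(optimized)
```

Because only synonymous codons are substituted, the encoded amino acid sequence is unchanged while the codon composition shifts toward the preferred set.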

In one embodiment, an optimized nucleic acid sequence encoding a
hydrolase or fusion thereof has less than 100%, e.g., less than 90% or less than
80%, nucleic acid sequence identity relative to a non-optimized nucleic acid
sequence encoding a corresponding hydrolase or fusion thereof. For instance,
an optimized nucleic acid sequence encoding DhaA has less than about 80%
nucleic acid sequence identity relative to non-optimized (wild-type) nucleic acid
sequence encoding a corresponding DhaA, and the DhaA encoded by the
optimized nucleic acid sequence optionally has at least 85% amino acid
sequence identity to a corresponding wild-type DhaA. In one embodiment, the
activity of a DhaA encoded by the optimized nucleic acid sequence is at least
10%, e.g., 50% or more, of the activity of a DhaA encoded by the non-optimized
sequence, e.g., a mutant DhaA encoded by the optimized nucleic acid sequence
binds a substrate with substantially the same efficiency, i.e., at least 50%, 80%,
100% or more, as the mutant DhaA encoded by the non-optimized nucleic acid
sequence binds the same substrate.
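
As a rough illustration of the identity thresholds above, percent identity between two sequences of equal length might be estimated position by position as sketched below. This ungapped comparison is a simplifying assumption; sequences of different length would first require alignment, which the sketch does not perform. The function name and example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match in two equal-length, ungapped sequences."""
    a, b = seq_a.upper(), seq_b.upper()
    if len(a) == 0 or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


# Hypothetical nucleic acid sequences encoding the same short peptide.
non_optimized = "ATGTTATCAAAATGA"
optimized = "ATGCTGAGCAAGTGA"

nucleotide_identity = percent_identity(non_optimized, optimized)
protein_identity = percent_identity("MLSK", "MLSK")

# Below 80% nucleic acid identity while amino acid identity stays at or above 85%.
print(nucleotide_identity < 80.0, protein_identity >= 85.0)
```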

The nucleic acid molecule or expression cassette may be introduced to a
vector, e.g., a plasmid or viral vector, which optionally includes a selectable
marker gene, and the vector introduced to a cell of interest, for example, a
prokaryotic cell such as E. coli, Streptomyces spp., Bacillus spp., Staphylococcus
spp. and the like, as well as eukaryotic cells including a plant (dicot or monocot),
fungus, yeast, e.g., Pichia, Saccharomyces or Schizosaccharomyces, or
mammalian cell. Preferred mammalian cells include bovine, caprine, ovine,
canine, feline, non-human primate, e.g., simian, and human cells. Preferred
mammalian cell lines include, but are not limited to, CHO, COS, 293, HeLa,
CV-1, SH-SY5Y (human neuroblastoma cells), HEK293, and NIH3T3 cells.

The expression of the encoded mutant hydrolase may be controlled by
any promoter capable of expression in prokaryotic cells or eukaryotic cells.
Preferred prokaryotic promoters include, but are not limited to, SP6, T7, T5, tac,
bla, trp, gal, lac or maltose promoters. Preferred eukaryotic promoters include,
but are not limited to, constitutive promoters, e.g., viral promoters such as CMV,
SV40 and RSV promoters, as well as regulatable promoters, e.g., an inducible or
repressible promoter such as the tet promoter, the hsp70 promoter and a
synthetic promoter regulated by CRE. Preferred vectors for bacterial expression
include pGEX-5X-3, and for eukaryotic expression include pCIneo-CMV.

The nucleic acid molecule, expression cassette and/or vector of the
invention may be introduced to a cell by any method including, but not limited
to, calcium-mediated transformation, electroporation, microinjection,
lipofection, particle bombardment and the like.

III. Functional Groups

Functional groups useful in the substrates and methods of the invention
are molecules that are detectable or capable of detection. A functional group
within the scope of the invention is capable of being covalently linked to one
reactive substituent of a bifunctional linker or a substrate for a hydrolase, and, as
part of a substrate of the invention, has substantially the same activity as a
functional group which is not linked to a substrate found in nature and is capable
of forming a stable complex with a mutant hydrolase. Functional groups thus
have one or more properties that facilitate detection, and optionally the isolation,
of stable complexes between a substrate having that functional group and a
mutant hydrolase. For instance, functional groups include those with a
characteristic electromagnetic spectral property such as emission or absorbance,
magnetism, electron spin resonance, electrical capacitance, dielectric constant or
electrical conductivity as well as functional groups which are ferromagnetic,
paramagnetic, diamagnetic, luminescent, electrochemiluminescent, fluorescent,
phosphorescent, chromatic, antigenic, or have a distinctive mass. A functional
group includes, but is not limited to, a nucleic acid molecule, i.e., DNA or RNA,
e.g., an oligonucleotide or nucleotide, a protein, e.g., a luminescent protein, a
peptide, for instance, an epitope recognized by a ligand, e.g., biotin or
streptavidin, a hapten, an amino acid, a lipid, a lipid bilayer, a solid support, a
fluorophore, a chromophore, a reporter molecule, a radionuclide, an electron
opaque molecule, an MRI contrast agent, e.g., manganese, gadolinium (III) or
iron-oxide particles, and the like. Methods to detect a particular functional
group are known to the art. For example, a nucleic acid molecule can be
detected by hybridization, amplification, binding to a nucleic acid binding
protein specific for the nucleic acid molecule, enzymatic assays (e.g., if the
nucleic acid molecule is a ribozyme), or, if the nucleic acid molecule itself
comprises a molecule which is detectable or capable of detection, for instance, a
radiolabel or biotin, it can be detected by an assay suitable for that molecule.

Exemplary functional groups include haptens, e.g., molecules useful to
enhance immunogenicity such as keyhole limpet hemocyanin (KLH), cleavable
labels, for instance, photocleavable biotin, and fluorescent labels, e.g.,
N-hydroxysuccinimide (NHS) modified coumarin and succinimide or
sulfonosuccinimide modified BODIPY (which can be detected by UV and/or
visible excited fluorescence detection), rhodamine, e.g., R110, rhodols, CRG6,
Texas Methyl Red (TAMRA), Rox5, FAM, or fluorescein, coumarin derivatives,
e.g., 7-aminocoumarin, and 7-hydroxycoumarin, 2-amino-4-methoxynaphthalene,
1-hydroxypyrene, resorufin, phenalenones or benzphenalenones (U.S. Patent No.
4,812,409), acridinones (U.S. Patent No. 4,810,636), anthracenes, and
derivatives of α- and β-naphthol, fluorinated xanthene derivatives including
fluorinated fluoresceins and rhodols (e.g., U.S. Patent No. 6,162,931), and
bioluminescent molecules, e.g., luciferase or GFP. A fluorescent (or
bioluminescent) functional group linked to a mutant hydrolase by virtue of being
linked to a substrate for a corresponding wild-type hydrolase, may be used to
sense changes in a system, like phosphorylation, in real time. Moreover, a
fluorescent molecule, such as a chemosensor of metal ions, e.g., a
9-carbonylanthracene modified glycyl-histidyl-lysine (GHK) for Cu2+, in a
substrate of the invention may be employed to label proteins which bind the
substrate. A bioluminescent or fluorescent functional group such as BODIPY,
rhodamine green, GFP, or infrared dyes, also finds use as a functional group and
may, for instance, be employed in interaction studies, e.g., using BRET, FRET,
LRET or electrophoresis.

Another class of functional group is a molecule that selectively interacts
with molecules containing acceptor groups (an "affinity" molecule). Thus, a
substrate for a hydrolase which includes an affinity molecule can facilitate the
separation of complexes having such a substrate and a mutant hydrolase, because
of the selective interaction of the affinity molecule with another molecule, e.g.,
an acceptor molecule, that may be biological or non-biological in origin. For
example, the specific molecule with which the affinity molecule interacts
(referred to as the acceptor molecule) could be a small organic molecule, a
chemical group such as a sulfhydryl group (-SH) or a large biomolecule such as
an antibody or other naturally occurring ligand for the affinity molecule. The
binding is normally chemical in nature and may involve the formation of
covalent or non-covalent bonds or interactions such as ionic or hydrogen
bonding. The acceptor molecule might be free in solution or itself bound to a
solid or semi-solid surface, a polymer matrix, or reside on the surface of a solid
or semi-solid substrate. The interaction may also be triggered by an external
agent such as light, temperature, pressure or the addition of a chemical or
biological molecule that acts as a catalyst. The detection and/or separation of the
complex from the reaction mixture occurs because of the interaction, normally a
type of binding, between the affinity molecule and the acceptor molecule.

Examples of affinity molecules include molecules such as immunogenic
molecules, e.g., epitopes of proteins, peptides, carbohydrates or lipids, i.e., any
molecule which is useful to prepare antibodies specific for that molecule; biotin,
avidin, streptavidin, and derivatives thereof; metal binding molecules; and
fragments and combinations of these molecules. Exemplary affinity molecules
include His5 (HHHHH) (SEQ ID NO:19), HisX6 (HHHHHH) (SEQ ID NO:20),
c-myc (EQKLISEEDL) (SEQ ID NO:21), Flag (DYKDDDDK) (SEQ ID
NO:22), StrepTag (WSHPQFEK) (SEQ ID NO:23), HA Tag (YPYDVPDYA)
(SEQ ID NO:24), thioredoxin, cellulose binding domain, chitin binding domain,
S-peptide, T7 peptide, calmodulin binding peptide, C-end RNA tag, metal
binding domains, metal binding reactive groups, amino acid reactive groups,
inteins, biotin, streptavidin, and maltose binding protein. For example, a
substrate for a hydrolase which includes biotin is contacted with a mutant
hydrolase. The presence of the biotin in a complex between the mutant
hydrolase and the substrate permits selective binding of the complex to avidin
molecules, e.g., streptavidin molecules coated onto a surface, e.g., beads,
microwells, nitrocellulose and the like. Suitable surfaces include resins for
chromatographic separation, plastics such as tissue culture surfaces or binding
plates, microtiter dishes and beads, ceramics and glasses, particles including
magnetic particles, polymers and other matrices. The treated surface is washed
with, for example, phosphate buffered saline (PBS), to remove molecules that
lack biotin, and the biotin-containing complexes are isolated. In some cases these
materials may be part of biomolecular sensing devices such as optical fibers,
chemfets, and plasmon detectors.
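
For illustration only, the peptide affinity tags recited above may be held in a lookup table and located within a protein sequence as sketched below. The table keys, the function name and the example fusion sequence are hypothetical; the tag sequences are those recited above.

```python
# Peptide affinity tags and their one-letter amino acid sequences.
AFFINITY_TAGS = {
    "His5": "HHHHH",
    "HisX6": "HHHHHH",
    "c-myc": "EQKLISEEDL",
    "Flag": "DYKDDDDK",
    "StrepTag": "WSHPQFEK",
    "HA Tag": "YPYDVPDYA",
}


def find_affinity_tags(protein):
    """Return {tag name: 0-based index of first occurrence} for tags present."""
    protein = protein.upper()
    return {
        name: protein.find(tag)
        for name, tag in AFFINITY_TAGS.items()
        if tag in protein
    }


# Hypothetical sequence carrying a Flag tag followed by six histidines.
example = "MDYKDDDDKGSHHHHHH"
print(find_affinity_tags(example))
```

Note that a six-histidine stretch also contains the five-histidine tag, so both are reported for the example sequence.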

Another example of an affinity molecule is dansyllysine. Antibodies
which interact with the dansyl ring are commercially available (Sigma Chemical;
St. Louis, MO) or can be prepared using known protocols such as described in
Antibodies: A Laboratory Manual (Harlow and Lane, 1988). For example, the
anti-dansyl antibody is immobilized onto the packing material of a
chromatographic column. This method, affinity column chromatography,
accomplishes separation by causing the complex between a mutant hydrolase
and a substrate of the invention to be retained on the column due to its
interaction with the immobilized antibody, while other molecules pass through
the column. The complex may then be released by disrupting the
antibody-antigen interaction. Specific chromatographic column materials such as
ion-exchange or affinity Sepharose, Sephacryl, Sephadex and other chromatography
resins are commercially available (Sigma Chemical; St. Louis, MO; Pharmacia
Biotech; Piscataway, N.J.). Dansyllysine may conveniently be detected because
of its fluorescent properties.

When employing an antibody as an acceptor molecule, separation can
also be performed through other biochemical separation methods such as
immunoprecipitation and immobilization of antibodies on filters or other
surfaces such as beads, plates or resins. For example, complexes of a mutant
hydrolase and a substrate of the invention may be isolated by coating magnetic
beads with an affinity molecule-specific or a hydrolase-specific antibody. Beads
are oftentimes separated from the mixture using magnetic fields.

Another class of functional molecules includes molecules detectable
using electromagnetic radiation and includes but is not limited to xanthene
fluorophores, dansyl fluorophores, coumarins and coumarin derivatives,
fluorescent acridinium moieties, benzopyrene based fluorophores, as well as
7-nitrobenz-2-oxa-1,3-diazole, and
3-N-(7-nitrobenz-2-oxa-1,3-diazol-4-yl)-2,3-diamino-propionic acid. Preferably,
the fluorescent molecule has a high
quantum yield of fluorescence at a wavelength different from native amino acids
and more preferably has high quantum yield of fluorescence that can be excited
in the visible, or in both the UV and visible, portion of the spectrum. Upon
excitation at a preselected wavelength, the molecule is detectable at low
concentrations either visually or using conventional fluorescence detection
methods. Electrochemiluminescent molecules such as ruthenium chelates and their
derivatives or nitroxide amino acids and their derivatives are detectable at
femtomolar ranges and below.

In addition to fluorescent molecules, a variety of molecules with physical
properties based on the interaction and response of the molecule to
electromagnetic fields and radiation can be used to detect complexes between a
mutant hydrolase and a substrate of the invention. These properties include
absorption in the UV, visible and infrared regions of the electromagnetic
spectrum, the presence of chromophores which are Raman active and can be
further enhanced by resonance Raman spectroscopy, electron spin resonance
activity, nuclear magnetic resonance, and molecular mass, e.g., as measured via
a mass spectrometer.

Methods to detect and/or isolate complexes having affinity molecules
include chromatographic techniques including gel filtration, fast-pressure or
high-pressure liquid chromatography, reverse-phase chromatography, affinity
chromatography and ion exchange chromatography. Other methods of protein
separation are also useful for detection and subsequent isolation of complexes
between a mutant hydrolase and a substrate of the invention, for example,
electrophoresis, isoelectric focusing and mass spectrometry.

IV. Linkers

The term "linker", which is also identified by the symbol 'L', refers to a
group or groups that covalently attach one or more functional groups to a
substrate which includes a reactive group or to a reactive group. A linker, as
used herein, is not a single covalent bond. The structure of the linker is not
crucial, provided it yields a substrate that can be bound by its target enzyme. In
one embodiment, the linker can be a divalent group that separates a functional
group (R) and the reactive group by about 5 angstroms to about 1000 angstroms,
inclusive, in length. Other suitable linkers include linkers that separate R and the
reactive group by about 5 angstroms to about 100 angstroms, as well as linkers
that separate R and the substrate by about 5 angstroms to about 50 angstroms, by
about 5 angstroms to about 25 angstroms, by about 5 angstroms to about 500
angstroms, or by about 30 angstroms to about 100 angstroms.

In one embodiment, the linker is an amino acid.

In another embodiment, the linker is a peptide.

In another embodiment, the linker is a divalent branched or unbranched
carbon chain comprising from about 2 to about 30 carbon atoms, which chain
optionally includes one or more (e.g., 1, 2, 3, or 4) double or triple bonds, and
which chain is optionally substituted with one or more (e.g., 2, 3, or 4) hydroxy
or oxo (=O) groups, wherein one or more (e.g., 1, 2, 3, or 4) of the carbon atoms
in the chain is optionally replaced with a non-peroxide -O-, -S- or -NH-.

In another embodiment, the linker is a divalent group of the formula
-W-F-W- wherein F is (C1-C30)alkyl, (C2-C30)alkenyl, (C2-C30)alkynyl,
(C3-C8)cycloalkyl, or (C6-C10)aryl, wherein W is -N(Q)C(=O)-, -C(=O)N(Q)-,
-OC(=O)-, -C(=O)O-, -O-, -S-, -S(O)-, -S(O)2-, -N(Q)-, -C(=O)-, or a direct
bond; wherein each Q is independently H or (C1-C6)alkyl.

In another embodiment, the linker is a divalent branched or unbranched
carbon chain comprising from about 2 to about 30 carbon atoms, which chain
optionally includes one or more (e.g., 1, 2, 3, or 4) double or triple bonds, and
which chain is optionally substituted with one or more (e.g., 2, 3, or 4) hydroxy
or oxo (=O) groups.

In another embodiment, the linker is a divalent branched or unbranched
carbon chain comprising from about 2 to about 30 carbon atoms, which chain
optionally includes one or more (e.g., 1, 2, 3, or 4) double or triple bonds.

In another embodiment, the linker is a divalent branched or unbranched
carbon chain comprising from about 2 to about 30 carbon atoms.

In another embodiment, the linker is a divalent branched or unbranched
carbon chain comprising from about 2 to about 20 carbon atoms, which chain
optionally includes one or more (e.g., 1, 2, 3, or 4) double or triple bonds, and
which chain is optionally substituted with one or more (e.g., 2, 3, or 4) hydroxy
or oxo (=O) groups.

In another embodiment, the linker is a divalent branched or unbranched
carbon chain comprising from about 2 to about 20 carbon atoms, which chain
optionally includes one or more (e.g., 1, 2, 3, or 4) double or triple bonds.

In another embodiment, the linker is a divalent branched or unbranched
carbon chain comprising from about 2 to about 20 carbon atoms.

In another embodiment, the linker is -(CH2CH2O)1-10-.

In another embodiment, the linker is -C(=O)NH(CH2)3-;
-C(=O)NH(CH2)5C(...; -C(=O)NH(CH2)2O(CH2)2O(CH2)3...; -(CH2)4C(=O)...; or
-C(=O)NH(C... [the remainder of these linker formulas is illegible in the
source].

Specifically, (C1-C30)alkyl can be methyl, ethyl, propyl, isopropyl, butyl,
iso-butyl, sec-butyl, pentyl, 3-pentyl, hexyl, heptyl, octyl, nonyl, or decyl;
(C3-C8)cycloalkyl can be cyclopropyl, cyclobutyl, cyclopentyl, or cyclohexyl;
(C2-C30)alkenyl can be vinyl, allyl, 1-propenyl, 2-propenyl, 1-butenyl, 2-butenyl,
3-butenyl, 1-pentenyl, 2-pentenyl, 3-pentenyl, 4-pentenyl, 1-hexenyl, 2-hexenyl,
3-hexenyl, 4-hexenyl, 5-hexenyl, heptenyl, octenyl, nonenyl, or decenyl;
(C2-C30)alkynyl can be ethynyl, 1-propynyl, 2-propynyl, 1-butynyl, 2-butynyl,
3-butynyl, 1-pentynyl, 2-pentynyl, 3-pentynyl, 4-pentynyl, 1-hexynyl, 2-hexynyl,
3-hexynyl, 4-hexynyl, 5-hexynyl, heptynyl, octynyl, nonynyl, or decynyl; and
(C6-C10)aryl can be phenyl, indenyl, or naphthyl.

The term "amino acid," when used with reference to a linker, comprises
the residues of the natural amino acids (e.g., Ala, Arg, Asn, Asp, Cys, Glu, Gln,
Gly, His, Hyl, Hyp, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, and Val) in
D or L form, as well as unnatural amino acids (e.g., phosphoserine,
phosphothreonine, phosphotyrosine, hydroxyproline, gamma-carboxyglutamate;
hippuric acid, octahydroindole-2-carboxylic acid, statine,
1,2,3,4-tetrahydroisoquinoline-3-carboxylic acid, penicillamine, ornithine,
citrulline, α-methyl-alanine, para-benzoylphenylalanine, phenylglycine,
propargylglycine, sarcosine, and tert-butylglycine). The term also includes
natural and unnatural amino acids bearing a conventional amino protecting
group (e.g., acetyl or benzyloxycarbonyl), as well as natural and unnatural amino
acids protected at the carboxy terminus (e.g., as a (C1-C6)alkyl, phenyl or benzyl
ester or amide). Other suitable amino and carboxy protecting groups are known
to those skilled in the art (see for example, Greene, Protecting Groups In
Organic Synthesis; Wiley: New York, 1981, and references cited therein). An
amino acid can be linked to another molecule through the carboxy terminus, the
amino terminus, or through any other convenient point of attachment, such as,
for example, through the sulfur of cysteine.

The term "peptide" when used with reference to a linker, describes a
sequence of 2 to 25 amino acids (e.g., as defined hereinabove) or peptidyl
residues. The sequence may be linear or cyclic. For example, a cyclic peptide
can be prepared or may result from the formation of disulfide bridges between
two cysteine residues in a sequence. A peptide can be linked to another
molecule through the carboxy terminus, the amino terminus, or through any
other convenient point of attachment, such as, for example, through the sulfur of
a cysteine. Preferably a peptide comprises 3 to 25, or 5 to 21 amino acids.
Peptide derivatives can be prepared as disclosed in U.S. Patent Numbers
4,612,302; 4,853,371; and 4,684,620. Peptide sequences specifically recited
herein are written with the amino terminus on the left and the carboxy terminus
on the right.

In one embodiment, a substrate of the invention for a dehalogenase
which has a linker has the formula (I):

R-linker-A-X (I)

wherein R is one or more functional groups (such as a fluorophore, biotin,
luminophore, or a fluorogenic or luminogenic molecule, or is a solid support,
including microspheres, membranes, glass beads, and the like), wherein the
linker is a multiatom straight or branched chain including C, N, S, or O,
wherein A-X is a substrate for a dehalogenase, and wherein X is a halogen. In
one embodiment, A-X is a haloaliphatic or haloaromatic substrate for a
dehalogenase. In one
embodiment, the linker is a divalent branched or unbranched carbon chain
comprising from about 12 to about 30 carbon atoms, which chain optionally
includes one or more (e.g., 1, 2, 3, or 4) double or triple bonds, and which chain
is optionally substituted with one or more (e.g., 2, 3, or 4) hydroxy or oxo (=O)
groups, wherein one or more (e.g., 1, 2, 3, or 4) of the carbon atoms in the chain
is optionally replaced with a non-peroxide -O-, -S- or -NH-. In one embodiment,
A is CH2CH2 or CH2CH2CH2. In one embodiment, a linker in a substrate for a
dehalogenase such as a Rhodococcus dehalogenase, is a multiatom straight or
branched chain including C, N, S, or O, and preferably 11-30 atoms when the
functional group R includes an aromatic ring system or is a solid support.

In another embodiment, a substrate of the invention for a dehalogenase
which has a linker has formula (II):

R-linker-CH2-CH2-CH2-X (II)

where X is a halogen, preferably chloride. In one embodiment, R is one or more
functional groups, such as a fluorophore, biotin, luminophore, or a fluorogenic or
luminogenic molecule, or is a solid support, including microspheres,
membranes, glass beads, and the like. When R is a radiolabel, or a small
detectable atom such as a spectroscopically active isotope, the linker can be 0-30
atoms.
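
The linker length preferences stated above for formulas (I) and (II) can be summarized, as a sketch only, in a small lookup. The category names and the function name are hypothetical labels chosen for illustration; the atom ranges are those recited above, the 11-30 range being stated as a preference for substrates of a dehalogenase such as a Rhodococcus dehalogenase.

```python
# Linker atom-count ranges by kind of functional group R (inclusive bounds).
LINKER_ATOM_RANGES = {
    "aromatic_ring_system": (11, 30),
    "solid_support": (11, 30),
    "radiolabel": (0, 30),
    "small_detectable_atom": (0, 30),
}


def linker_length_in_range(r_kind, n_linker_atoms):
    """Return True if the linker atom count falls in the range recited for r_kind."""
    low, high = LINKER_ATOM_RANGES[r_kind]
    return low <= n_linker_atoms <= high


print(linker_length_in_range("aromatic_ring_system", 12))  # within 11-30
print(linker_length_in_range("aromatic_ring_system", 5))   # shorter than 11
print(linker_length_in_range("radiolabel", 0))             # 0 atoms permitted
```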



30 V. Syntheses for Exemplary Substrates 

[2-(2-Hydroxy-ethoxy)-ethyl]-carbamic acid anthracen-9-ylmethyl ester. To
a stirring slurry of 9-anthracenemethanol (10 g, 48 mmol) and 4-nitrophenyl
chloroformate (13.6 g, 67.5 mmol) in 200 ml CH2Cl2 was added triethylamine
(6.7 ml, 0.19 mol). The resulting gold colored solution was allowed to stir 16
hrs at room temperature. At this point, 2-(2-aminoethoxy)ethanol (14.4 ml,
0.144 mol) was added and stirring continued for another 24 hours. The CH2Cl2
reaction mixture was then washed with a 2% sodium hydroxide (w/w) solution
until no p-nitrophenol was observed in the organic layer. The dichloromethane
was dried with sodium sulfate, filtered, and evaporated under reduced pressure.

The crude product was further purified by column chromatography on
silica gel 60, progressively eluting with 1% to 3% methanol in dichloromethane.
7.6 g (58% yield) of a yellow solid was isolated: 1H NMR (CDCl3) δ 8.38 (s,
H-10), 8.28 (d, H-1, 8), 7.94 (d, H-4, 5), 7.44 (m, H-2, 3, 6, 7), 6.06 (s,
CH2-anth), 5.47 (t, exchangeable, NH), 3.53 (bs, CH2-OH), 3.33 (m, three -CH2-).
Mass spectrum, m/e Calcd for C20H22NO4+: 340.15. Found: 340.23. Calcd for
C20H21NNaO4+: 340.15. Found: 340.23.




a compound of formula III

{2-[2-(6-Chloro-hexyloxy)-ethoxy]-ethyl}-carbamic acid anthracen-9-ylmethyl
ester. A 100 ml round bottom flask was charged with
[2-(2-Hydroxy-ethoxy)-ethyl]-carbamic acid anthracen-9-ylmethyl ester (1.12 g,
3 mmol) and fresh sodium hydride, 60% dispersion in mineral oil (360 mg, 9 mmol)
under inert atmosphere. 20 ml anhydrous THF was added and the reaction allowed to
stir for 30 minutes. The flask is then cooled to between -10 and -20°C by
means of an ice/NaCl bath. When the temperature is reached, 1-chloro-6-iodohexane
(1 ml, 6 mmol) is added via syringe. The reaction is maintained at
ice/NaCl temperature for 2 hours, then slowly allowed to warm to room
temperature overnight. At this point silica gel 60 is co-absorbed onto the
reaction mixture with loss of solvent under reduced pressure. Silica gel
chromatography takes place initially with heptane as eluent, followed by 10%,
20% and 25% ethyl acetate. A total of 0.57 g (41% yield) of product is isolated
from appropriate fractions: 1H NMR (CDCl3) δ 8.48 (s, H-10), 8.38 (d, H-1, 8),
8.01 (d, H-4, 5), 7.52 (dt, H-2, 3, 6, 7), 6.13 (s, CH2-anth), 5.29 (bs,
exchangeable, NH), 3.74 (m, 4H), 3.55-3.15 (m, 8H), 1.84 (m, 4H), 1.61 (m,
1H), 1.43 (m, 1H), 1.25 (m, 2H). Mass spectrum, m/e Calcd for
C26H32ClNO4·H2O: 475.21(100%), 476.22(29.6%). Found: 475.21, 476.52.
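
The reported yield can be cross-checked from the quantities given in this procedure. The sketch below is an illustration only: it assumes the product is weighed as the anhydrous compound C26H32ClNO4, that the starting carbamate (stated as 3 mmol) is the limiting reagent, and it uses standard average atomic weights; the helper names are hypothetical.

```python
# Standard average atomic weights (g/mol).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "Cl": 35.45}


def molar_mass(composition: dict) -> float:
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(ATOMIC_WEIGHT[element] * count for element, count in composition.items())


def percent_yield(product_g: float, composition: dict, limiting_mmol: float) -> float:
    """Percent yield from isolated mass and the amount of limiting reagent."""
    product_mmol = 1000.0 * product_g / molar_mass(composition)
    return 100.0 * product_mmol / limiting_mmol


# Assumed composition of the anhydrous product, C26H32ClNO4.
product = {"C": 26, "H": 32, "Cl": 1, "N": 1, "O": 4}
print(f"{percent_yield(0.57, product, 3.0):.1f} %")
```

Under these assumptions the calculation gives roughly 41%, in line with the yield stated above.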



2-[2-(6-chlorohexyloxy)-ethoxy]-ethyl-ammonium trifluoro-acetate. To
{2-[2-(6-Chloro-hexyloxy)-ethoxy]-ethyl}-carbamic acid anthracen-9-ylmethyl
ester (0.56 g, 1.2 mmol) dissolved in 4 ml dichloromethane was added 2 drops of
anisole. The reaction mixture is cooled by means of an ice/NaCl bath. After 10
minutes trifluoroacetic acid (2 ml) is added. The reaction mixture turns dark
brown upon addition and is allowed to stir for 30 minutes. All volatiles are
removed under reduced pressure. The residue is re-dissolved in CH2Cl2 and
washed twice with water. The aqueous fractions are frozen and lyophilized
overnight. An oily residue remains and is dissolved in anhydrous DMF to be
used as a stock solution in further reactions. Mass spectrum, m/e Calcd for
C10H23ClNO2+: 224.14(100%), 226.14(32%). Found: 224.2, 226.2.
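
The paired peaks reported for the chlorinated compounds (here 224.14 at 100% and 226.14 at about 32%) reflect the natural 35Cl/37Cl isotope pattern. The following sketch recomputes both values for C10H23ClNO2+ as an illustration. It assumes a singly charged cation containing exactly one chlorine, uses standard monoisotopic masses and representative natural abundances, and ignores the smaller 13C and 18O contributions to the M+2 peak; the function names are hypothetical.

```python
# Monoisotopic masses (u) of the lightest stable isotopes.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "Cl": 34.9688527,   # 35Cl
}
CL37_MASS = 36.9659026
CL35_ABUNDANCE = 0.7576   # representative natural abundances
CL37_ABUNDANCE = 0.2424
ELECTRON_MASS = 0.00054858


def cation_mz(composition: dict) -> float:
    """m/z of a singly charged cation with the given elemental composition."""
    neutral = sum(MONOISOTOPIC[el] * n for el, n in composition.items())
    return neutral - ELECTRON_MASS


def chlorine_m_plus_2(composition: dict) -> tuple:
    """m/z and relative intensity (%) of the 37Cl peak for a mono-chloro ion."""
    if composition.get("Cl") != 1:
        raise ValueError("this sketch only handles ions with exactly one Cl")
    mz = cation_mz(composition) + (CL37_MASS - MONOISOTOPIC["Cl"])
    relative_intensity = 100.0 * CL37_ABUNDANCE / CL35_ABUNDANCE
    return mz, relative_intensity


ion = {"C": 10, "H": 23, "Cl": 1, "N": 1, "O": 2}
m2_mz, m2_rel = chlorine_m_plus_2(ion)
print(f"M: {cation_mz(ion):.2f}  M+2: {m2_mz:.2f} ({m2_rel:.0f}%)")
```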



a compound of formula IV

General methodology for reporter group conjugation to
2-[2-(6-chlorohexyloxy)-ethoxy]-ethylamine. To one equivalent of the succinimidyl
ester of the reporter group in DMF is added 3 equivalents of
2-[2-(6-chlorohexyloxy)-ethoxy]-ethyl-ammonium trifluoro-acetate stock solution,
followed by diisopropylethylamine. The reaction is stirred from 8 to 16 hours at
room temperature. Purification is accomplished by preparative scale HPLC or silica
gel chromatography.
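
The 3:1 stoichiometry of the general methodology translates into a stock volume once the stock concentration is known. The text does not state that concentration, so `stock_molar` below is a hypothetical parameter, as are the function name and the example amounts; this is a sketch of the arithmetic only.

```python
def amine_stock_volume_ml(ester_mmol: float, stock_molar: float,
                          equivalents: float = 3.0) -> float:
    """Volume (ml) of amine stock needed for the given amount of succinimidyl ester.

    stock_molar is the (assumed) concentration of the DMF stock in mol/l,
    which equals mmol/ml, so mmol divided by mmol/ml gives ml.
    """
    if ester_mmol < 0 or stock_molar <= 0:
        raise ValueError("amounts must be non-negative and concentration positive")
    return equivalents * ester_mmol / stock_molar


# Hypothetical example: 0.05 mmol of ester and a 0.5 M stock solution.
print(f"{amine_stock_volume_ml(0.05, 0.5):.2f} ml")
```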

N-{2-[2-(6-Chlorohexyloxy)-ethoxy]-ethyl}-fluorescein-5-amide. The title
compound was prepared using the above methodology. Purification was
accomplished using preparative scale HPLC. Mass spectrum, m/e Calcd for
C31H31ClNO8-: 580.17(100%), 581.18(32%). Found: 580.18, 581.31.

a compound of formula VI

N-{2-[2-(6-Chlorohexyloxy)-ethoxy]-ethyl}-biotin-amide. The title compound
was prepared using the above methodology. Purification was accomplished
using silica gel chromatography (2% to 5% methanol in dichloromethane). Mass
spectrum, m/e Calcd for C20H37ClN3O4S+: 450.22(100%), 452.22(32%). Found:
449.95, 451.89.

a compound of formula VII

N-{2-[2-(6-Chlorohexyloxy)-ethoxy]-ethy[...]
6)-amide. The title compound was prepared using the above methodology.
Purification was accomplished using preparative scale HPLC. Separation of
structural isomers was realized. Mass spectrum, m/e Calcd for C35H43ClN3O6+:
636.28(100%), 637.29(39.8%), 638.28(32.4%). Found: 636.14, 637.15, 638.14.

a compound of formula VIII a compound of formula IX

N-{2-[2-(6-Chlorohexylo[...] R110-5-(and -6)-amide.
The title compound was prepared using the above methodology.
Purification was accomplished using preparative scale HPLC. Separation of
structural isomers was realized. Mass spectrum, m/e Calcd for C31H35ClN3O6+:
580.2(100%), 581.2(35.6%), 582.2(32.4%). Found: 580.4, 581.4, 582.2.

a compound of formula X a compound of formula XI

6-({4-[4,4-difluoro-5-(thiophen-2-yl)-4-bora-3a,4a-diaza-s-indacene-3-yl]phenoxy}-acetylamino)-hexanoic
acid {2-[2-(6-chlorohexyloxy)-ethoxy]-ethyl}-amide. The title compound was
prepared using the above methodology. Purification was accomplished using
silica gel chromatography (3% to 5% methanol in dichloromethane). Mass
spectrum, m/e Calcd for C37H47BClF2N4O5S+: 743.3(100%). Found: 743.4.

a compound of formula XII

[...]yl]styryloxy}-acetylamino)-hexanoic acid {2-[2-(6-chlorohexyloxy)-ethoxy]-ethyl}-amide.
The title compound was prepared using the above methodology.
Purification was accomplished using silica gel chromatography (3% methanol in
dichloromethane). Mass spectrum, m/e Calcd for C39H48BClF2N4NaO5S+:
791.3(100%). Found: 791.3.

a compound of formula XIII

Triethylammonium 3-[5-[2-(4-tert-Butyl-7-diethylamino-chro[...]
ylidene)-ethylidene]-3-(5-{2-[2-(6-chlorohexyloxy)-ethoxy[...]
pentyl)-2,4,6-trioxo-tetrahydro-[...]
anion. The title compound was prepared using the above methodology.
Purification was accomplished using preparative scale HPLC. Mass spectrum,
m/e Calcd for C42H62ClN4O10S-: 849.4(100%), 850.4(48.8%), 851.4(36.4%).
Found: 849.6, 850.5, 851.5.

a compound of formula XIV

2-tert-Butyl[...]{3-[1-(5-[...]
pentyl)-3[...]-dimethyl[...]
diethylamino-chromenylium chloride. The title compound was prepared using
the above methodology. Purification was accomplished using preparative scale
HPLC. Mass spectrum, m/e Calcd for C46H67ClN3O7S+: 840.4(100%),
841.4(54.4%). Found: 840.5, 841.5.

a compound of formula XV

N-{2-[2-(6-Chlorohexyloxy)[...]
phenyl)-oxazol-2-yl]-benzenesulfonylamino}-propionamide. The title
compound was prepared using the above methodology. Purification was
accomplished using preparative scale HPLC. Mass spectrum, m/e Calcd for
C30H40ClN4O6S-: 619.2(100%), 620.2(35%). Found: 619.5, 620.7.

a compound of formula XVI

N-{2-[2-(6-Chlorohexyloxy)-ethoxy]-ethyl}-9'-chloroseminaphthofluorescein-5-(and -6)-amide.
The title compound was
prepared using the above methodology. Purification was accomplished using
preparative scale HPLC. Separation of structural isomers was realized. Mass
spectrum, m/e Calcd for C35H34Cl2NO8+: 666.17(100%), 668.16(64%),
667.17(39.8%). Found: 666.46, 668.44, 667.51.

a compound of formula XVII a compound of formula XVIII

N-{2-[2-(6-Chlorohexyloxy)-ethoxy]-ethyl}-seminaphthodimethylrhodamine-5-(and -6)-amide.
The title compound was
prepared using the above methodology. Purification was accomplished using
preparative scale HPLC. Mass spectrum, m/e Calcd for C37H38ClN2O7-:
657.24(100%), 658.24(42%), 659.23(32%). Found: 657.46, 658.47, 659.45.




a compound of formula XIX a compound of formula XX 



6-(3',6'-dipivaloylfluorescein-5-(and-6)-carboxamido) hexanoic acid
{2-[2-(6-chlorohexyloxy)-ethoxy]-ethyl}-amide. To a 100 ml round bottom flask
containing 6-(3',6'-dipivaloylfluorescein-5-(and-6)-carboxamido) hexanoic acid
succinimidyl ester (0.195 g, 0.26 mmol) was added
2-[2-(6-chlorohexyloxy)-ethoxy]-ethylamine (~0.44 mmol) in 25 ml Et2O, followed
by 2 ml of pyridine. The reaction mixture was allowed to stir overnight. After
evaporation under reduced pressure, the residue was subjected to silica gel 60
column chromatography, progressively using 2% to 5% methanol in dichloromethane as
eluent. The appropriate fractions were collected and dried under vacuum (0.186
g, 0.216 mmol, 84% yield). Mass spectrum, m/e Calcd for C47H60ClN2O11+:
863.39(100%), 864.39(54.4%), 865.39(34.6%). Found: 862.94, 864.07, 864.94.

a compound of formula XXI a compound of formula XXII



6-(fluorescein-5-(and-6)-carboxamido) hexanoic acid
{2-[2-(6-chlorohexyloxy)-ethoxy]-ethyl}-amide.
6-(3',6'-dipivaloylfluorescein-5-(and-6)-carboxamido) hexanoic acid
{2-[2-(6-chlorohexyloxy)-ethoxy]-ethyl}-amide
(0.186 g, 0.216 mmol) was dissolved in 5 ml methanol, and 0.5 ml of 2 M aqueous
sodium carbonate was added. The reaction mixture was stirred for 16 hours, then
filtered. Purification was accomplished using preparative scale HPLC.
Separation of structural isomers was realized. Mass spectrum, m/e Calcd for
C37H44ClN2O9+: 695.27 (100.0%), 696.28 (42.2%), 697.27 (32.3%). Found:

a compound of formula XXIII a compound of formula XXIV



{2-[2-(4-Chlorobutoxy)-ethoxy]-ethyl}-carbamic acid anthracen-9-ylmethyl
ester. A 50 ml round bottom flask was charged with
[2-(2-Hydroxyethoxy)-ethyl]-carbamic acid anthracen-9-ylmethyl ester (0.25 g,
0.74 mmol) and fresh sodium hydride, 60% dispersion in mineral oil (150 mg,
3.75 mmol) under inert atmosphere. 10 ml anhydrous THF was added and the
reaction allowed to stir for 5 minutes. After this point, 1-chloro-4-iodobutane
(180 µl, 1.5 mmol) is added via syringe. The reaction is stirred at room
temperature for 24 hours. Silica gel 60 is co-absorbed onto the reaction mixture
with loss of solvent under reduced pressure. Silica gel column chromatography
takes place initially with heptane as eluent, followed by 10%, 20%, and 30%
ethyl acetate. A total of 0.1 g (32% yield) of product is isolated from
appropriate fractions: 1H NMR (CDCl3) δ 8.50 (s, H-10), 8.40 (d, H-1, 8),
8.03 (d, H-4, 5), 7.53 (dt, H-2, 3, 6, 7), 6.15 (s, CH2-anth), 5.19 (m,
exchangeable, NH), 3.93-3.32 (m, 12H), 1.69-1.25 (m, 4H). Mass spectrum, m/e
Calcd for C24H28ClNO4·H2O: 447.18 (100.0%), 448.18 (27.1%). Found: 447.17,
448.41.

a compound of formula XXV

2-(2-{2-[2-(2-Chloroethoxy)-ethoxy]-ethoxy}-ethyl)-isoindole-1,3-dione.
2-(2-{2-[2-(2-Hydroxy-ethoxy)-ethoxy]-ethoxy}-[...] (0.5 g,
1.55 mmol) was prepared by the method of Nielsen, J. and Janda, K.D.
(Methods: A Companion to Methods in Enzymology 6, 361-371 (1994)). To
this reagent were added polystyrene-supported triphenylphosphine (about 3 mmol
P/g; 0.67 g, 2 mmol) and 6 ml carbon tetrachloride, in a 25 ml round bottom
flask fitted with a reflux condenser. The reaction set-up was sparged with argon
then heated to reflux for 2 hours. Upon cooling, more polystyrene-supported
triphenylphosphine (0.1 g, 0.3 mmol) was added and the reaction refluxed for an
additional one hour. The cooled solution was filtered and the resin washed with
additional carbon tetrachloride. Evaporation of solvent yielded 0.4 g (75.5%
yield) of pure title compound: 1H NMR (CDCl3) δ 7.82 (dd, 2H), 7.69 (dd, 2H),
3.88 (t, 2H), 3.71 (q, 4H), 3.63-3.56 (m, 12H). Mass spectrum, m/e Calcd for
C16H21ClNO5+: 342.11 (100.0%), 344.11 (32.0%). Found: 341.65, 343.64.

a compound of formula XXVI

2-[2-(2-{2-[2-(2-Chloroethoxy)-ethox[...]
dione. The title compound was prepared according to the previous example in
89% yield: 1H NMR (CDCl3) δ 7.77 (dd, 2H), 7.64 (dd, 2H), 3.83 (t, 2H), 3.67
(m, 4H), 3.60-3.52 (m, 14H). Mass spectrum, m/e Calcd for C18H25ClNO6+:
386.14 (100.0%), 388.13 (32.0%). Found: 385.88, 387.83.

a compound of formula XXVII

2-{2-[2-(2-{2-[2-(2-Chlo[...]
isoindole-1,3-dione. The title compound was prepared according to the
synthesis of 2-(2-{2-[2-(2-Chloro-ethoxy)-ethoxy]-ethoxy}-ethyl)-isoindole-1,3-dione
in 92% yield: 1H NMR (CDCl3) δ 7.84 (dd, 2H), 7.71 (dd, 2H), 3.90
(t, 2H), 3.74 (q, 4H), 3.67-3.58 (m, 18H). Mass spectrum, m/e Calcd for
C20H29ClNO7+: 430.16 (100.0%). Found: 429.85.

a compound of formula XXVIII


VI. Exemplary Methods of Use 

The invention provides methods to monitor the expression, location
and/or trafficking of molecules in a cell, as well as to monitor changes in
microenvironments within a cell. In one embodiment, a mutant hydrolase and a
corresponding substrate which includes a functional group are employed to label
a cell, e.g., a cell in an organism or cell culture, or a cellular component. For
instance, cells are contacted with a vector encoding the mutant hydrolase, such
as one encoding a fusion between the mutant hydrolase and a nuclear
localization signal. The expression of the vector in the cell may be transient or
stable. Then the cell is contacted with a substrate of the invention recognized by
the mutant hydrolase. Alternatively, cells are concurrently contacted with the
vector and the substrate. Then the presence or location of the functional group
of the substrate in the cell, a lysate thereof, or a subcellular fraction thereof, is
detected or determined.

The substrates of the invention are preferably soluble in an aqueous or
mostly aqueous solution, including water and aqueous solutions having a pH
greater than or equal to about 6. Stock solutions of substrates of the invention,
however, may be dissolved in organic solvent before diluting into aqueous
solution or buffer. Preferred organic solvents are aprotic polar solvents such as
DMSO, DMF, N-methylpyrrolidone, acetone, acetonitrile, dioxane,
tetrahydrofuran and other nonhydroxylic, completely water-miscible solvents. In
general, the amount of substrate of the invention employed is the minimum
amount required to detect the presence of the functional group in the sample
comprising a mutant hydrolase or a fusion thereof, within a reasonable time,
with minimal background or undesirable labeling. The exact concentration of a
substrate of the invention and a corresponding mutant hydrolase to be used is
dependent upon the experimental conditions and the desired results. The
concentration of a substrate of the invention typically ranges from nanomolar to
micromolar. The required concentration for the substrate of the invention with a
corresponding mutant hydrolase is determined by systematic variation in
substrate until satisfactory labeling is accomplished. The starting ranges are
readily determined from methods known in the art.
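
One way to organize the systematic variation in substrate concentration described above is a logarithmically spaced titration series. The sketch below is illustrative only: the default bounds (1 nM to 10 µM) are an assumed reading of "nanomolar to micromolar", and the function name and step density are hypothetical choices rather than values taken from the text.

```python
import math


def titration_series(low_molar: float = 1e-9, high_molar: float = 1e-5,
                     points_per_decade: int = 1) -> list:
    """Log-spaced substrate concentrations (mol/l) from low_molar to high_molar."""
    if not 0 < low_molar < high_molar:
        raise ValueError("require 0 < low_molar < high_molar")
    if points_per_decade < 1:
        raise ValueError("points_per_decade must be at least 1")
    decades = math.log10(high_molar / low_molar)
    n_steps = round(decades * points_per_decade)
    return [low_molar * 10 ** (i / points_per_decade) for i in range(n_steps + 1)]


for conc in titration_series():
    print(f"{conc:.1e} M")
```

Each concentration in the series would then be tested for labeling and background, and the lowest one giving satisfactory labeling retained.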

In one embodiment, a substrate which includes a functional group with
optical properties is employed with a mutant hydrolase to label a sample. Such a
substrate is combined with the sample of interest comprising the mutant
hydrolase for a period of time sufficient for the mutant hydrolase to bind the
substrate, after which the sample is illuminated at a wavelength selected to elicit
the optical response of the functional group. Optionally, the sample is washed to
remove residual, excess or unbound substrate. In one embodiment, the labeling
is used to determine a specified characteristic of the sample by further
comparing the optical response with a standard or expected response. For
example, the mutant hydrolase bound substrate is used to monitor specific
components of the sample with respect to their spatial and temporal distribution
in the sample. Alternatively, the mutant hydrolase bound substrate is employed
to determine or detect the presence or quantity of a certain molecule. In another
embodiment, the mutant hydrolase bound substrate is used to analyze the sample
for the presence of a molecule that responds specifically to the functional group.

A detectable optical response means a change in, or occurrence of, a
parameter in a test system that is capable of being perceived, either by direct
observation or instrumentally. Such detectable responses include the change in,
or appearance of, color, fluorescence, reflectance, chemiluminescence, light
polarization, light scattering, or x-ray scattering. Typically the detectable
response is a change in fluorescence, such as a change in the intensity, excitation
or emission wavelength distribution of fluorescence, fluorescence lifetime,
fluorescence polarization, or a combination thereof. The detectable optical
response may occur throughout the sample comprising a mutant hydrolase or a
fusion thereof or in a localized portion of the sample comprising a mutant
hydrolase or a fusion thereof. Comparison of the degree of optical response with
a standard or expected response can be used to determine whether and to what
degree the sample comprising a mutant hydrolase or a fusion thereof possesses a
given characteristic.

In another embodiment, the functional group is a ligand for an acceptor
molecule. Typically, where the substrate comprises a functional group that is a
member of a specific binding pair (a ligand), the complementary member (the
acceptor) is immobilized on a solid or semi-solid surface, such as a polymer,
polymeric membrane or polymeric particle (such as a polymeric bead).
Representative specific binding pairs include biotin and avidin (or streptavidin or
anti-biotin), IgG and protein A or protein G, drug and drug receptor, toxin and
toxin receptor, carbohydrate and lectin or carbohydrate receptor, peptide and
peptide receptor, protein and protein receptor, enzyme substrate and enzyme,
sense DNA or RNA and antisense (complementary) DNA or RNA, hormone and
hormone receptor, and ion and chelator. Ligands for which naturally occurring
receptors exist include natural and synthetic proteins, including avidin and
streptavidin, antibodies, enzymes, and hormones; nucleotides and natural or
synthetic oligonucleotides, including primers for RNA and single- and
double-stranded DNA; lipids; polysaccharides and carbohydrates; and a variety of
drugs, including therapeutic drugs and drugs of abuse and pesticides. Where the
functional group is a chelator of calcium, sodium, magnesium, potassium, or
another biologically important metal ion, the substrate comprising such a
functional group functions as an indicator of the ion. Alternatively, such a
substrate may act as a pH indicator. Preferably, the detectable optical response of
the ion indicator is a change in fluorescence.

The sample comprising a mutant hydrolase or a fusion thereof is typically
labeled by passive means, i.e., by incubation with the substrate. However, any
method of introducing the substrate into the sample comprising a mutant
hydrolase or a fusion thereof, such as microinjection of a substrate into a cell or
organelle, can be used to introduce the substrate into the sample comprising a
mutant hydrolase or a fusion thereof. The substrates of the present invention are
generally non-toxic to living cells and other biological components, within the
concentrations of use.

The sample comprising a mutant hydrolase or a fusion thereof can be
observed immediately after contact with a substrate of the invention. The sample
comprising a mutant hydrolase or a fusion thereof is optionally combined with
other solutions in the course of labeling, including wash solutions,
permeabilization and/or fixation solutions, and other solutions containing
additional detection reagents. Washing following contact with the substrate
generally improves the detection of the optical response due to the decrease in
non-specific background after washing. Satisfactory visualization is possible
without washing by using lower labeling concentrations. A number of fixatives
and fixation conditions are known in the art, including formaldehyde,
paraformaldehyde, formalin, glutaraldehyde, cold methanol and 3:1
methanol:acetic acid. Fixation is typically used to preserve cellular morphology
and to reduce biohazards when working with pathogenic samples. Selected
embodiments of the substrates are well retained in cells. Fixation is optionally
followed or accompanied by permeabilization, such as with acetone, ethanol,
DMSO or various detergents, to allow bulky substrates of the invention to cross
cell membranes, according to methods generally known in the art. Optionally,
the use of a substrate may be combined with the use of an additional detection
reagent that produces a detectable response due to the presence of a specific cell
component, intracellular substance, or cellular condition, in a sample comprising
a mutant hydrolase or a fusion thereof. Where the additional detection reagent
has spectral properties that differ from those of the substrate, multi-color
applications are possible.

At any time after or during contact with the substrate comprising a
functional group with optical properties, the sample comprising a mutant
hydrolase or a fusion thereof is illuminated with a wavelength of light that
results in a detectable optical response, and observed with a means for detecting
the optical response. While some substrates are detectable colorimetrically, using
ambient light, other substrates are detected by the fluorescence properties of the
parent fluorophore. Upon illumination, such as by an ultraviolet or visible
wavelength emission lamp, an arc lamp, a laser, or even sunlight or ordinary
room light, the substrates, including substrates bound to the complementary
specific binding pair member, display intense visible absorption as well as
fluorescence emission. Selected equipment that is useful for illuminating the
substrates of the invention includes, but is not limited to, hand-held ultraviolet
lamps, mercury arc lamps, xenon lamps, argon lasers, laser diodes, and YAG
lasers. These illumination sources are optionally integrated into laser scanners,
fluorescence microplate readers, standard or mini fluorometers, or
chromatographic detectors. This colorimetric absorbance or fluorescence
emission is optionally detected by visual inspection, or by use of any of the
following devices: CCD cameras, video cameras, photographic film, laser
scanning devices, fluorometers, photodiodes, quantum counters, epifluorescence
microscopes, scanning microscopes, flow cytometers, fluorescence microplate
readers, or by means for amplifying the signal such as photomultiplier tubes.
Where the sample comprising a mutant hydrolase or a fusion thereof is examined
using a flow cytometer, a fluorescence microscope or a fluorometer, the
instrument is optionally used to distinguish and discriminate between the
substrate comprising a functional group which is a fluorophore and a second
fluorophore with detectably different optical properties, typically by
distinguishing the fluorescence response of the substrate from that of the second
fluorophore. Where the sample comprising a mutant hydrolase or a fusion
thereof is examined using a flow cytometer, examination of the sample
comprising a mutant hydrolase or a fusion thereof optionally includes isolation
of particles within the sample comprising a mutant hydrolase or a fusion thereof
based on the fluorescence response of the substrate by using a sorting device.

In one embodiment, intracellular movements may be monitored using a
fusion of the mutant hydrolase of the invention. For example, beta-arrestin is a
regulator of G-protein coupled receptors that moves from the cytoplasm to the
cell membrane when it is activated. A cell containing a fusion of a mutant
hydrolase and beta-arrestin and a substrate of the invention allows the detection
of the movement of beta-arrestin from the cytoplasm to the cell membrane as it
associates with activated G-protein coupled receptors.

WO 2004/072232 PCT/US2004/002607

In another embodiment, FRET may be employed with a fusion of the mutant
hydrolase and a fluorescent protein, e.g., GFP, or a fusion with a protein that
binds fluorescent molecules, e.g., O6-alkylguanine-DNA alkyltransferase (AGT)
(Keppler et al., 2003). Alternatively, a fusion of a mutant hydrolase and a
protein of interest and a second fusion of a fluorescent protein and a molecule
suspected of interacting with the protein of interest may be employed to study
the interaction of the protein of interest with the molecule, e.g., using FRET.
One cell may contain the fusion of a mutant hydrolase and a protein of interest
while another cell may contain the second fusion of a fluorescent protein and a
molecule suspected of interacting with the protein of interest. A population with
those two cells may be contacted with a substrate and an agent, e.g., a drug, after
which the cells are monitored to detect the effect of agent administration on the
two populations.
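
For readers less familiar with FRET readouts, the two textbook relations commonly used to interpret such experiments are sketched below. They are standard Förster-theory expressions, not formulas taken from this document: the efficiency as a function of donor-acceptor distance, E = 1 / (1 + (r/R0)^6), and the apparent efficiency from donor quenching, E = 1 - F_DA/F_D. The function names and example numbers are hypothetical.

```python
def fret_efficiency(r_nm: float, r0_nm: float) -> float:
    """Förster transfer efficiency for donor-acceptor distance r and Förster radius R0."""
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("distance must be non-negative and R0 positive")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


def efficiency_from_donor_quenching(f_donor_with_acceptor: float,
                                    f_donor_alone: float) -> float:
    """Apparent efficiency from donor intensity with and without the acceptor."""
    if f_donor_alone <= 0:
        raise ValueError("donor-alone intensity must be positive")
    return 1.0 - f_donor_with_acceptor / f_donor_alone


print(fret_efficiency(5.0, 5.0))                     # 0.5 when r equals R0
print(efficiency_from_donor_quenching(25.0, 100.0))  # 0.75
```

The steep sixth-power dependence is what makes FRET useful for reporting whether two labeled partners are in close proximity.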

In yet another embodiment, the mutant hydrolase is fused to a fluorescent
protein. The fusion protein can thus be detected in cells by detecting the
fluorescent protein or by contacting the cells with a substrate of the invention
and detecting the functional group in the substrate. The detection of the
fluorescent protein may be conducted before the detection of the functional
group. Alternatively, the detection of the functional group may be conducted
before the detection of the fluorescent protein. Moreover, those cells can be
contacted with additional substrates, e.g., those having a different functional
group, and the different functional group in the cell detected, which functional
group is covalently linked to mutant hydrolase not previously bound by the first
substrate.

In yet another embodiment, a fusion of a mutant hydrolase and a
transcription factor may be employed to monitor transcription activation
pathways. For example, a fusion of a mutant hydrolase to a transcription factor
that is present in the cytoplasm in an inactive form but is translocated to the
nucleus upon activation (e.g., NF-kappa B) can be used to monitor
transcription activation pathways.

In another embodiment, biotin is employed as a functional group in a
substrate and the fusion includes a mutant hydrolase fused to a protein of interest
suspected of interacting with another molecule, e.g., a protein, in a cell. The use
of such reagents permits the capture of the other molecule which interacts in the
cell with the protein fused to the mutant hydrolase, thereby identifying and/or
capturing (isolating) the interacting molecule(s).

In one embodiment, the mutant hydrolase is fused to a protein that is
secreted. Using that fusion and a substrate of the invention, the secreted protein
may be detected and/or monitored. Similarly, when the mutant hydrolase is
fused to a membrane protein that is transported between different vesicular
compartments, in the presence of the substrate, protein processing within these
compartments can be detected. In yet another embodiment, when the mutant
hydrolase is fused to an ion channel or transport protein, or a protein that is
closely associated with the channel or transport protein, the movement of ions
across cell or organelle membranes can be monitored in the presence of a
substrate of the invention which contains an ion sensitive fluorophore. Likewise,
when the mutant hydrolase is fused to proteins associated with vesicles or
cytoskeleton, in the presence of the substrate, transport of proteins or vesicles
along cytoskeletal structures can be readily detected.

In another embodiment, the functional group is a drug or toxin. By
combining a substrate with such a functional group with a fusion of a mutant
hydrolase and a targeting molecule such as an antibody, e.g., one which binds to
an antigen associated with specific tumor cells, a drug or toxin can be targeted
within a cell or within an animal. Alternatively, the functional group may be a
fluorophore; when a substrate containing it is combined with a fusion of a
mutant hydrolase and a targeting molecule such as a single chain antibody, the
targeting molecule is labeled, yielding, e.g., a labeled antibody for in vitro
applications such as an ELISA.

In yet another embodiment, when fused to a protein expressed on the cell
surface, a mutant hydrolase on the cell surface, when combined with a substrate
of the invention, e.g., one which contains a fluorophore, may be employed to
monitor cell migration (e.g., cancer cell migration) in vivo or in vitro. In one
embodiment, the substrate of the invention is one that has low or no permeability
to the cell membrane. Alternatively, such a system can be used to monitor the
effect of different agents, e.g., drugs, on different pools of cells. In yet another
embodiment, the mutant hydrolase is fused to a HERG channel. Cells
expressing such a fusion, in the presence of a substrate of the invention which
includes a K+-sensitive fluorophore, may be employed to monitor the activity of
the HERG channel, e.g., to monitor drug toxicity.

In another embodiment, the substrate of the invention includes a
functional group useful to monitor for hydrophobic regions, e.g., Nile Red, in a
cell or organism.

Thus, the mutant hydrolases and substrates of the invention are useful in
a wide variety of assays, e.g., phage display, panning, ELISA, Western blot,
fluorometric microvolume assay technology (FMAT), and cell and subcellular
staining.

The invention will be further described by the following non-limiting 
examples. 

Example I 

General Methodologies

Unless defined otherwise, all technical and scientific terms used herein
have the same meaning as commonly understood by one of ordinary skill in the
field of molecular biology and cellular signaling and modeling. Generally, the
nomenclature used herein and the laboratory procedures in spectroscopy, drug
discovery, cell culture, molecular genetics, plastic manufacture, polymer
chemistry, diagnostics, amino acid and nucleic acid chemistry, and alkane
chemistry described below are those well known and commonly employed in the
art. Standard techniques are typically used for preparation of plastics, signal
detection, recombinant nucleic acid methods, polynucleotide synthesis, and
microbial culture and transformation (e.g., electroporation, lipofection).

The techniques and procedures are generally performed according to
conventional methods in the art and various general references (see generally,
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed. (1989) Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., and Lakowicz, J.R.
Principles of Fluorescence Spectroscopy, New York: Plenum Press (1983) for
fluorescent techniques, which are incorporated herein by reference) and which
are provided throughout this document. Standard techniques are used for
chemical synthesis, chemical analysis, and biological assays.

Materials

All oligonucleotides were synthesized, purified and sequenced by
Promega Corporation (Madison, WI) or the University of Iowa DNA Facility
(Iowa City, Iowa). Restriction enzymes and DNA modifying enzymes were
obtained from Promega Corporation (Madison, WI), New England Biolabs, Inc.
(Beverly, MA) or Stratagene Cloning Systems (La Jolla, CA), and were used
according to the manufacturer's protocols. Competent E. coli JM109 were
provided by Promega Corporation or purchased from Stratagene Cloning
Systems. Small-scale plasmid DNA isolations were done using the Qiagen
Plasmid Mini Kit (Qiagen Inc., Chatsworth, CA). DNA ligations were
performed with pre-tested reagent kits purchased from Stratagene Cloning
Systems. DNA fragments were purified with QIAquick Gel Extraction Kits or
QIAquick PCR Purification Kits purchased from Qiagen Inc.

The vectors used for generating DhaA mutants and their fusions were as
follows: pET21 (Invitrogen, Carlsbad, CA), pRL-null (Promega, Madison, WI),
pGEX-5x-3 (Amersham Biosciences; Piscataway, NJ), and EGFP and DsRED2
(both from CLONTECH, Palo Alto, CA).

SDS-polyacrylamide gels and associated buffers and stains, as well as
electroblot transfer buffers, were obtained from BioWhittaker Molecular
Applications (Rockland, ME). Protein molecular weight standards were
purchased from Invitrogen.

Sigma-Aldrich was the source of the anti-FLAG® monoclonal antibodies
(anti-FLAG® M2 monoclonal antibody (mouse) (F3165)), anti-FLAG® M2 HRP
conjugate and anti-FLAG® M2 FITC conjugate (A8592 and F4049, respectively).
Chemicon (Temecula, CA) was the source of monoclonal anti-Renilla luciferase
antibody (MAB4410). Promega Corp. was the source of HRP-conjugated goat
anti-mouse IgG and HRP-conjugated streptavidin (W4021 and G714,
respectively).

1-Cl-butane, 1-Cl-hexane, 1-Cl-octane, 1-Cl-decane, 1-Cl-butanol,
1-Cl-hexanol, 1-Cl-octanol, and 1-Cl-decanol were obtained from Aldrich or from
Fluka (USA). All salts, monobasic potassium phosphate, dibasic potassium
phosphate, imidazole, HEPES, sodium EDTA, ammonium sulfate, and Tris free
base were from Fisher (Biotech Grade).

Glutathione Sepharose 4 FF, glutathione, MonoQ and Sephadex G-25 
prepackaged columns were from Amersham Biosciences. 

Luria-Broth ("LB") was provided by Promega Corporation. 
Methods 

PCR reactions. DNA amplification was performed using standard
polymerase chain reaction buffers supplied by Promega Corp. Typically, 50 µl
reactions included 1x concentration of the manufacturer's supplied buffer, 1.5
mM MgCl2, 125 µM dATP, 125 µM dCTP, 125 µM dGTP, 125 µM dTTP,
0.10-1.0 µM forward and reverse primers, 5 U AmpliTaq® DNA Polymerase and < 1
ng target DNA. Unless otherwise indicated, the thermal profile for amplification
of DNA was 35 cycles of 0.5 minutes at 94°C; 1 minute at 55°C; and 1 minute at
72°C.
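
The reaction composition and thermal profile described for the PCR reactions reduce to simple arithmetic. The following Python sketch computes the volume of each stock needed for one 50 µl reaction and the total programmed hold time for 35 cycles. The stock concentrations (10x buffer, 25 mM MgCl2, 10 mM of each dNTP) are illustrative assumptions that are not stated in the text, and ramp times between temperatures are ignored.

```python
# Sketch only: the stock concentrations below are assumed for illustration.
REACTION_VOLUME_UL = 50.0


def stock_volume_ul(stock_conc, final_conc, reaction_volume_ul=REACTION_VOLUME_UL):
    """Volume of stock needed to reach the final concentration (C1*V1 = C2*V2)."""
    return final_conc * reaction_volume_ul / stock_conc


# (stock, final) pairs in matching units; the stock values are hypothetical.
components = {
    "buffer (x)": (10.0, 1.0),
    "MgCl2 (mM)": (25.0, 1.5),
    "dATP (mM)": (10.0, 0.125),
    "dCTP (mM)": (10.0, 0.125),
    "dGTP (mM)": (10.0, 0.125),
    "dTTP (mM)": (10.0, 0.125),
}

volumes = {name: stock_volume_ul(s, f) for name, (s, f) in components.items()}

# Thermal profile from the text: 35 cycles of 0.5 min (94 C), 1 min (55 C), 1 min (72 C).
CYCLES = 35
STEP_MINUTES = (0.5, 1.0, 1.0)
total_hold_minutes = CYCLES * sum(STEP_MINUTES)

if __name__ == "__main__":
    for name, vol in volumes.items():
        print(f"{name}: {vol:.3f} ul")
    print(f"total hold time: {total_hold_minutes} min")
```

With these assumed stocks, each reaction takes 3 µl of MgCl2 stock and 0.625 µl of each dNTP stock, and the programmed hold time is 87.5 minutes.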

DNA sequencing. All clones were confirmed by DNA sequencing using
the dideoxy-terminal cycle-sequencing method (Sanger et al., 1977) and a
Perkin-Elmer Model 310 DNA sequencer (Foster City, CA).

SDS-PAGE. Proteins were solubilized in a sample buffer (1% SDS, 10%
glycerol, and 1.0 mM β-mercaptoethanol, pH 6.8; Promega Corporation), boiled
for 5 minutes and resolved on SDS-PAGE (4-20% gradient gels; BioWhittaker
Molecular Applications). Gels were stained with Coomassie Blue (Promega
Corp.) for Western blot analysis or were analyzed on a fluoroimager (Hitachi,
Japan) at an Eex/Eem appropriate for each fluorophore evaluated.

Western blot analysis. Electrophoretic transfer of proteins to a
nitrocellulose membrane (0.2 µm, Schleicher & Schuell, Germany) was carried
out in 25 mM Tris base/188 mM glycine (pH 8.3), 20% (v/v) methanol for 2.0
hours with a constant current of 80 mA (at 4°C) in an Xcell II Blot module
(Invitrogen). The membranes were rinsed with TBST buffer (10 mM Tris-HCl,
150 mM NaCl, pH 7.6, containing 0.05% Tween 20) and incubated in blocking
solution (3% dry milk or 1% BSA in TBST buffer) for 30 minutes at room
temperature or overnight at 4°C. The membranes were then washed with 50 ml of
TBST buffer and incubated with anti-FLAG® monoclonal antibody M2 (dilution
1:5,000), anti-Renilla luciferase monoclonal antibody (dilution 1:5,000), or
HRP-conjugated streptavidin (dilution 1:10,000) for 45 minutes at room
temperature. The membranes were then washed with TBST buffer (50 ml, 5
minutes, 3 times). The membranes that had been probed with antibody were
then incubated with HRP-conjugated donkey anti-mouse IgG (30 minutes, room
temperature) and the washing procedure was repeated. The proteins were
visualized by the enhanced chemiluminescence (ECL) system
(Pharmacia-Amersham) according to the manufacturer's instructions. Levels of
proteins were quantified using computer-assisted densitometry.

Protein concentration. Protein was measured by the microtiter protocol
of the Pierce BCA Protein assay (Pierce, Rockford, IL) using bovine serum
albumin (BSA) as a standard.

Statistical analysis. Data were expressed as mean +/- S.E.M. values from
experiments performed in quadruplicate, representative of at least 3 independent
experiments with similar results. Statistical significance was assessed by
Student's t test and considered significant when p < 0.05.
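
As an illustration of the statistics described here, the Python sketch below computes a mean, its S.E.M., and a two-sample Student's t statistic using only the standard library. The replicate values are hypothetical, and the choice of an unpaired, equal-variance test is an assumption, since the text does not say which form of the t test was applied. Significance at p < 0.05 is judged against the two-tailed critical value for 6 degrees of freedom (two quadruplicate groups).

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for replicate values."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def student_t(a, b):
    """Unpaired Student's t statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, df


# Hypothetical quadruplicate readings (arbitrary units), for illustration only.
control = [1.0, 2.0, 3.0, 4.0]
treated = [3.0, 4.0, 5.0, 6.0]

# Two-tailed critical value of t for p = 0.05 with 6 degrees of freedom.
T_CRIT_DF6 = 2.447

m, sem = mean_sem(control)
t, df = student_t(control, treated)
significant = df == 6 and abs(t) > T_CRIT_DF6

if __name__ == "__main__":
    print(f"control: {m:.2f} +/- {sem:.3f} (S.E.M.)")
    print(f"t = {t:.3f}, df = {df}, significant at p < 0.05: {significant}")
```

For other group sizes the critical value must be changed to match the degrees of freedom.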

Bacterial cells. The initial stock of DH5α cells containing pET-3a with
Rhodococcus rhodochrous (DhaA) was kindly provided by Dr. Clifford J.
Unkefer (Los Alamos National Laboratory, Los Alamos, NM) (Schindler et al.,
1999; Newman et al., 1999). Bacteria were cultured in LB using a premixed
reagent provided by Promega Corp. Freezer stocks of E. coli BL21 (λDE3)
pET3a (stored in 10% glycerol, -80°C) were used to inoculate Luria-Bertani agar
plates supplemented with ampicillin (50 µg/ml) (Sambrook et al., 1989). Single
colonies were selected and used to inoculate two 10 ml cultures of Luria-Bertani
medium containing 50 µg/ml ampicillin. The cells were cultured for 8 hours at
37°C with shaking (220 rpm), after which time 2 ml was used to inoculate each
of two 50 ml cultures of Luria-Bertani medium containing 50 µg/ml ampicillin,
which were grown overnight at 37°C with shaking. Ten milliliters of this culture
was used to inoculate each of two 0.5 L cultures of Luria-Bertani medium with
ampicillin. When the A600 of the culture reached 0.6,
isopropyl-1-thio-β-D-galactopyranoside (IPTG) was added to a final
concentration of 0.5 mM, and cultures were maintained for an additional 4 hours
at 30°C with shaking. The cells were then harvested by centrifugation and
washed with 10 mM Tris-SO4, 1 mM EDTA, pH 7.5. The cell pellets were stored
at -70°C prior to cell lysis.

Mammalian cells. CHO-K1 cells (ATCC-CCL61) were cultured in a 1:1
mixture of Ham's F12 nutrients and Dulbecco's modified minimal essential
medium supplemented with 10% fetal bovine serum (FBS), 100 U/ml penicillin,
and 100 mg/ml streptomycin, in an atmosphere of 95% air and 5% CO2 at 37°C.

Rat hippocampal (E18) primary neurons were isolated as described
below. Briefly, fragments of embryonic (E18) rat hippocampus in Hibernate™
E media (GIBCO, Invitrogen, Carlsbad, CA), obtained from Dr. Brewer
(Southern Illinois University), were dissociated and plated on
poly-D-lysine-coated (0.28 mg/cm²; Sigma) glass/plastic-ware and cultured in
serum-free Neurobasal™ media with B27 supplement (NB27, GIBCO). All media were
changed every 2-3 days.

Transfection. To study transient expression of different proteins, cells
were plated in 35 mm culture dishes or 24 well plates. At about 80-90%
confluency, the cells were exposed to a mixture of
Lipofectamine/DNA/antibiotic-free media according to the manufacturer's
(GIBCO) instructions. The following day, media was replaced with fresh media
and cells were allowed to grow for various periods of time.

Fluorescence. Fluorescence in cells in 96 well plates was measured on a
fluorescent plate reader, CytoFluor II (Beckman), at an Eex/Eem appropriate for
particular fluorophores (e.g., Eex/Eem for TAMRA is 540/575 nm).

Example II

A DhaA-Based Tethering System

A. Wild-Type and Mutant DhaA Proteins and Fusions Thereof

A haloalkane dehalogenase from Rhodococcus rhodochrous is a
product of the DhaA gene (MW about 33 kDa). This enzyme cleaves
carbon-halogen bonds in aliphatic and aromatic halogenated compounds, e.g.,
HaloC3-HaloC10. The catalytic center of DhaA is a typical "catalytic triad",
comprising a nucleophile, an acid and a histidine residue. It is likely that
substrate binds to DhaA to form an E*S complex, after which nucleophilic attack
by Asp106 forms an ester intermediate. His272 then activates H2O, which
hydrolyzes the intermediate, releasing product from the catalytic center. To
determine whether a point mutation of the catalytic His272 residue impairs
enzymatic activity of the enzyme so as to enable covalent tethering of a
functional group (FG) to this protein, mutant DhaAs were prepared.
Materials and Methods 

To prepare mutant DhaA vectors, Promega's in vitro mutagenesis kit,
which is based on the four-primer overlap-extension method (Ho et al., 1989),
was employed to produce DhaA.H272 to F, A, G, or Q mutations. The external
primers were oligonucleotides
5'-GCTTCACTTGTCGTCATCGTCCTTGTAGTCA-3' (SEQ ID NO:1) and
5'-GCTTCACTTGTCGTCATCGTCCTTGTAGTCA-3' (SEQ ID NO:2), and the
internal mutagenic primers were as follows:
H272F (5'-CCGGGATTGTTCTACCTCCAGGAAGAC-3'; SEQ ID NO:3),
H272A (5'-CCGGGATTGGCCTACCTCCAGGAAGAC-3'; SEQ ID NO:4),
H272G (5'-CCGGGATTGCAGTACCTCCAGGAAGAC-3'; SEQ ID NO:5), and
H272Q (5'-CCGGGATTGGGCTACCTCCAGGAAGAC-3'; SEQ ID NO:6)
(the mutated codons are underlined). The mutated dehalogenase genes were
subcloned into the pET-3a vector. For overexpression of mutant dehalogenases,
the pET-3a vector was transformed into competent E. coli BL21 (DE3). The
DhaA sequence in clones was confirmed by DNA sequencing.
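
A quick way to check mutagenic primers of this kind is to translate them and read off the residue at the mutated position. The Python sketch below does this for the four internal primers with the standard genetic code. The reading frame (starting at the first base) and the position of the mutated codon (the fourth) are assumptions inferred from the sequences themselves, and the dictionary keys are simply the labels printed in the text.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Label as printed in the text -> primer sequence (5' to 3').
PRIMERS = {
    "H272F": "CCGGGATTGTTCTACCTCCAGGAAGAC",
    "H272A": "CCGGGATTGGCCTACCTCCAGGAAGAC",
    "H272G": "CCGGGATTGCAGTACCTCCAGGAAGAC",
    "H272Q": "CCGGGATTGGGCTACCTCCAGGAAGAC",
}

# Assumed position of the mutated codon (fourth codon, zero-based index 3).
MUTATED_CODON_INDEX = 3

encoded = {label: translate(seq)[MUTATED_CODON_INDEX] for label, seq in PRIMERS.items()}

if __name__ == "__main__":
    for label, residue in encoded.items():
        print(f"{label}: codon 4 encodes {residue}")
```

Under these assumptions the fourth codon of the four primers reads TTC (Phe), GCC (Ala), CAG (Gln) and GGC (Gly). The CAG primer is printed under the H272G label and the GGC primer under H272Q, so the labels for SEQ ID NO:5 and SEQ ID NO:6 may be transposed in the text.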

GST-DhaA (WT or H272F/A/G/Q mutants) fusion cassettes were
constructed by cloning the appropriate DhaA coding regions into SalI/NotI sites
of the pGEX5x3 vector. Two primers
(5'-ACGCGTCGACGCCGCCATGTCAGAAATCGGTACAGGC-3' and
5'-ATAAGAATGCGGCCGCTCAAGCGCTTCAACCGGTGAGTGCGGGGAGCCAGCGCGC-3';
SEQ ID NOs:7 and 8, respectively) were designed to add a
SalI site and a Kozak consensus sequence to the 5' coding region of DhaA, to
add NotI, EcoR47III, and AgeI restriction sites and stop codons to the 3' coding
region of DhaA, and to amplify an 897 bp fragment from a DhaA (WT or mutant)
template. The resulting fragments were inserted into the SalI/NotI site of
pGEX-5X-3, a vector containing a glutathione S-transferase (GST) gene, a
sequence encoding a Factor Xa cleavage site, and multiple cloning sites (MCS)
followed by a stop codon.

A Flag coding sequence was then inserted into the AgeI/EcoR47III
restriction sites of the pGEX5X-3 vector. In frame with the six nucleotide AgeI
site is a sequence for an 11 amino acid peptide, the final octapeptide of which
corresponds to the Flag peptide (Kodak Imaging Systems, Rochester, NY). Two
complementary oligonucleotides
(5'-CCGGTGACTACAAGGACGATGACGACAAGTGAAGC-3', sense, SEQ ID NO:9, and
5'-GCTTCACTTGTCGTCATCGTCCTTGTAGTCA-3', antisense, SEQ ID NO:10)
coding the Flag peptide (Kodak Imaging Systems, Rochester,
NY) were annealed. The annealed DNA had an AgeI site at the 5' end and an
EcoR47III site at the 3' end. The annealed DNA was digested with AgeI and
EcoR47III and then subcloned into the GST-DhaA.WT or GST-DhaA.H272F
mutant constructs at the AgeI and EcoR47III sites. All gene fusion constructs
were confirmed by DNA sequencing.
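
The pairing of the two Flag oligonucleotides can be checked by taking the reverse complement of the antisense strand and aligning it to the sense strand. The Python sketch below does so with the antisense sequence read as GCTTCACTTGTCGTCATCGTCCTTGTAGTCA; the variable names are illustrative. Any bases of the sense strand left unpaired at its 5' end form a single-stranded overhang.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


SENSE = "CCGGTGACTACAAGGACGATGACGACAAGTGAAGC"   # SEQ ID NO:9
ANTISENSE = "GCTTCACTTGTCGTCATCGTCCTTGTAGTCA"   # SEQ ID NO:10, as read here

# Region of the sense strand that base-pairs with the antisense strand.
paired_region = reverse_complement(ANTISENSE)

# True if the antisense strand pairs with the 3' portion of the sense strand.
fully_paired = SENSE.endswith(paired_region)

# Unpaired bases at the 5' end of the sense strand.
overhang_5prime = SENSE[: len(SENSE) - len(ANTISENSE)]

if __name__ == "__main__":
    print("antisense pairs with sense 3' region:", fully_paired)
    print("5' overhang on sense strand:", overhang_5prime)
```

The overhang CCGG matches the four-base single-stranded end left by AgeI cleavage of ACCGGT, which is consistent with the AgeI site described at the 5' end of the annealed DNA.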

To generate GST-DhaA fusion proteins, enzyme expression was induced
by the addition of isopropyl-β-D-thiogalactopyranoside (at a final concentration
of 0.5 mM) when the culture reached an optical density of 0.6 at 600 nm. The
cells were harvested in Buffer A (10 mM Tris-SO4, 1 mM EDTA, 1 mM
β-mercaptoethanol, and 10% glycerol, pH 7.5), and disrupted by sonication using
a Vibra Cell™ sonicator (Sonics & Materials, Danbury, CT, USA). Cell debris
was removed by centrifugation at 19,800 x g for 1 hour. The crude extract was
further purified on a GSS-Sepharose 4 fast flow column (Amersham
Biosciences; Piscataway, NJ) according to the manufacturer's instructions. The
elution fractions containing GST-DhaA fusion protein were pooled, dialyzed
against a 10 mM Tris-SO4 buffer (containing 20 mM Na2SO4 and 1 mM
EDTA-Na2) overnight at 4°C, and stored at -20°C until use. To generate DhaA
(WT or mutant), GST was cleaved from the fusion proteins with Factor Xa, and
the products purified on GSS-Sepharose 4 (Amersham Biosciences; Piscataway,
NJ) according to the manufacturer's instructions. Homogeneity of the proteins
was verified by SDS-PAGE. In some experiments, the cell free extract was
fractionated using 45-70% saturated ammonium sulfate as described by Newman
et al. (1999).
Results 

Figure 3 shows robust, IPTG-inducible production of GST-DhaA.WT-Flag
(lane 1) and GST-DhaA.H272F-Flag (lane 2) fusion proteins. Moreover,
the proteins were soluble and could be efficiently purified on GSS-Sepharose
4FF (lanes 5-10; odd-numbered lanes correspond to GST-DhaA.WT-Flag and
even-numbered lanes correspond to GST-DhaA.H272F-Flag). Treatment of the
fusion proteins with Factor Xa led to the formation of two proteins, GST and
DhaA (WT or mutant, lanes 11 and 12, respectively), and GST was efficiently
removed on GSS-Sepharose 4FF (WT or mutant, lanes 13 and 14, respectively).
In addition, all proteins had the predicted molecular weight.
B. Mutation of H272 Impairs Ability of DhaA to Hydrolyze Cl-Alkanes

Inability of an enzyme to release the product of the enzymatic reaction into
the surrounding media is essential for the tethering system. This inability can
be detected as a significant reduction in the hydrolytic activity of the enzyme.

To study the effect of a point mutation on the hydrolysis of Cl-alkanes by
DhaA (WT or mutant), a pH-indicator dye system as described by
Holloway et al. (1998) was employed.
Materials and Methods 

The reaction buffer for a pH-indicator dye system consisted of 1 mM
HEPES-SO4 (pH 8.2), 20 mM Na2SO4, and 1 mM EDTA. Phenol red was added
to a final concentration of 25 µg/ml. The halogenated compounds were added to
apparent concentrations that could ensure that the dissolved fraction of the
substrate was sufficient for the maximum velocity of the dehalogenation
reaction. The substrate-buffer solution was vigorously mixed for 30 seconds by
vortexing, capped to prevent significant evaporation of the substrate and used
within 1-2 hours. Prior to each kinetic determination, the phenol red was titrated
with a standardized solution of HCl to provide an apparent extinction coefficient.
The steady-state kinetic constants for DhaA were determined at 558 nm at room
temperature on a Beckman DU640 spectrophotometer (Beckman Coulter,
Fullerton, CA). Kinetic constants were calculated from initial rates using the
computer program SigmaPlot. One unit of enzyme activity is defined as the
amount required to dehalogenate 1.0 mM of substrate/minute under the specific
conditions.
Results 

As shown in Figure 4, using 0.1 mg/ml of enzyme and 10 mM substrate
at pH 7.0-8.2, no catalytic activity was found with any of four mutants. Under
these conditions, the wild-type enzyme had an activity with 1-Cl-butane of 5
units/mg of protein. Thus, the activity of the mutants was reduced by at least
700-fold.
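
When no mutant activity is detected, a fold reduction can only be stated as a lower bound, namely the wild-type activity divided by the smallest activity the assay could have detected. The Python sketch below makes that relationship explicit. The detection limit is not given in the text; the value computed here is merely the limit implied by the two reported figures (5 units/mg and "at least 700-fold").

```python
WILD_TYPE_ACTIVITY = 5.0    # units/mg protein with 1-Cl-butane (from the text)
REPORTED_MIN_FOLD = 700.0   # "at least 700-fold" (from the text)


def min_fold_reduction(wild_type_activity, detection_limit):
    """Lower bound on fold reduction when mutant activity is below the detection limit."""
    if detection_limit <= 0:
        raise ValueError("detection limit must be positive")
    return wild_type_activity / detection_limit


# Detection limit implied by the reported numbers (an inference, not a stated value).
implied_detection_limit = WILD_TYPE_ACTIVITY / REPORTED_MIN_FOLD

if __name__ == "__main__":
    print(f"implied detection limit: {implied_detection_limit:.4f} units/mg")
    print(f"fold reduction >= {min_fold_reduction(WILD_TYPE_ACTIVITY, implied_detection_limit):.0f}")
```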

Aliquots of the supernatant obtained from E. coli expressing DhaA (WT
or one of the mutants) were treated with increasing concentrations of (NH4)2SO4.
The proteins were exposed to each (NH4)2SO4 concentration for 2 hours (4°C),
pelleted by centrifugation, dialyzed overnight against buffer A, and resolved on
SDS-PAGE.

As shown in Figure 5, a major fraction of DhaA.WT and the
DhaA.H272F mutant was precipitated by 45-70% (NH4)2SO4. No
precipitation of these proteins was observed at low (NH4)2SO4 concentrations.
In contrast, the DhaA.H272Q, DhaA.H272G and DhaA.H272A mutants could be
precipitated by 10% (NH4)2SO4. This is a strong indication of a significant
change in the physico-chemical characteristics of the DhaA.H272Q,
DhaA.H272G and DhaA.H272A mutants. At the same time, the DhaA.H272F
mutation had no significant effect on these parameters. These data are in good
agreement with results of computer modeling of the effect of mutations on the
3-D structure of DhaA, indicating that among all tested mutants, only the
DhaA.H272F mutation had no significant effect on the predicted 3-dimensional
model (see Figure 2). Based on these results, DhaA.H272F was chosen for
further experiments.

To form a covalent adduct, the chlorine atom of Cl-alkane is likely
positioned in close proximity to the catalytic amino acids of DhaA (WT or
mutant) (Figure 2). The crystal structure of DhaA (Newman et al., 1999)
indicates that these amino acids are located deep inside of the catalytic pocket of
DhaA (approximately 10 Å long and about 20 Å² in cross section). To permit
entry of the reactive group in a substrate for DhaA which includes a functional
group into the catalytic pocket of DhaA, a linker was designed to connect the
Cl-containing substrate with a functional group so that the functional group is
located outside of the catalytic pocket, i.e., so as not to disturb/destroy the 3-D
structure of DhaA.

To determine if DhaA is capable of hydrolyzing Cl-alkanes with a long
hydrophobic carbon chain, DhaA.WT was contacted with various Cl-alkane
alcohols. As shown in Figure 6, DhaA.WT can hydrolyze 1-Cl-alkane alcohols
with 4-10 carbon atoms. Moreover, the initial rate of hydrolysis (IRH) of
Cl-alkanes had an inverse relationship to the length of a carbon chain, although
poor solubility of long-chain Cl-alkanes in aqueous buffers may affect the
efficiency of the enzyme-substrate interaction. Indeed, as shown in Figure 6, the
IRH of 1-Cl-alkane-10-decanol is much higher than the IRH of 1-Cl-decane. More
importantly, these data indicate that DhaA can hydrolyze Cl-alkanes containing
relatively polar groups (e.g., HO-group).

FAM-modified Cl-alkanes with linkers of different length and/or
hydrophobicity were prepared (Figure 7). DhaA.WT efficiently hydrolyzed
Cl-alkanes with a relatively bulky functional group (FAM) if the linker was 12 or
more atoms long. No activity of DhaA.H272F/A/G/Q mutants was detected
with any of the tested Cl-alkanes (data not shown). In addition, modification of
the (CH2)6 region adjacent to the Cl-atom led to a significant reduction of the
IRH of the 14-atom linker by DhaA.WT. Nevertheless, if the length and
structure of the linker is compatible with the catalytic site of a hydrolase, the
presence of a linker in a substrate of the invention has substantially no effect on
the reaction.

Some of the samples were analyzed on an automated HPLC (Hewlett-Packard
Model 1050) system. A DAD detector was set to record UV-visible
spectra over the 200-600 nm range. Fluorescence was detected at an Eex/Eem
equal to 480/520 nm and 540/575 nm for FAM- and TAMRA-modified substrates,
respectively. Ethanol extracts of Cl-alkanes or products of Cl-alkane hydrolysis
were analyzed using an analytical reverse phase C18 column (Adsorbosphere HS,
5 µ, 150 x 4.6 mm; Hewlett-Packard, Clifton, NJ) with a linear gradient of 10
mM ammonium acetate (pH 7.0):ACN (acetonitrile) from 25:75 to 1:99 (v/v)
applied over 30 minutes at 1.0 ml/minute. Quantitation of the separated
compounds was based on the integrated surface of the collected peaks.
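
The mobile-phase composition at any point in the run follows directly from the linear gradient described for the HPLC analysis (acetonitrile rising from 75% to 99% v/v over 30 minutes at 1.0 ml/minute). The Python sketch below computes it; holding the composition constant before 0 and after 30 minutes is an assumption made only to keep the function defined outside the gradient window.

```python
def percent_acetonitrile(t_min, start=75.0, end=99.0, duration_min=30.0):
    """Percent ACN (v/v) at time t_min for a linear gradient, clamped to the gradient window."""
    t = min(max(t_min, 0.0), duration_min)
    return start + (end - start) * t / duration_min


def eluted_volume_ml(t_min, flow_ml_per_min=1.0):
    """Volume of mobile phase delivered after t_min minutes at constant flow."""
    return t_min * flow_ml_per_min


if __name__ == "__main__":
    for t in (0.0, 15.0, 30.0):
        acn = percent_acetonitrile(t)
        print(f"t = {t:4.1f} min: {acn:.1f}% ACN, {100.0 - acn:.1f}% buffer, "
              f"{eluted_volume_ml(t):.1f} ml delivered")
```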

Figure 8A shows the complete separation of the substrate and the product
of the reaction. Figure 8B indicates that wild-type DhaA very efficiently
hydrolyzed FAM-C14H24O4-Cl. Similar results were obtained when
TAMRA-C14H24O4-Cl or ROX.5-C14H24O4-Cl were used as substrates (data not
shown). Taken together, these data confirm the results of the pH-indicator
dye-based assay showing complete inactivation of DhaA by the DhaA.H272F
mutation.
C. Covalent Tethering of Functional Groups to DhaA Mutants In Vitro

Materials and Methods

MALDI analysis of proteins was performed at the University of
Wisconsin Biotechnology Center using a matrix assisted laser
desorption/ionization time-of-flight (MALDI-TOF) mass spectrometer Bruker
Biflex III (Bruker, USA). To prepare samples, 100 µg of purified DhaA (WT or
H272F mutant) or GST-DhaA (WT or H272F mutant) fusion protein (purified to
about 90% homogeneity) in 200 µl of buffer (1 mM HEPES-SO4 (pH 7.4), 20
mM Na2SO4, and 1 mM EDTA) were incubated with or without substrate
(FAM-C14H24O4-Cl, at 1.0 mM, final concentration) for 15 minutes at room
temperature. Then the reaction mixtures were dialyzed against 20 mM
CH3COONH4 (pH 7.0) overnight at 4°C and M/Z values of the proteins and
protein-substrate complexes determined.

Oligonucleotides employed to prepare DhaA.D106 mutants include
for DhaA.D106C:
5'-CTTGGGTTTGGAAGAGGTCGTCCTGGTCATCCACT (SEQ ID NO:13) and
5'-TGAGCCCCAGCAGTGGATGACCAGGACGACCTCTTCCAAACC-3' (SEQ ID NO:14);
for DhaA.D106Q:
5'-CTTGGGTTTGGAAGAGGTCGTCCTGGTCATCCACCAGTGGGGC-3' (SEQ ID NO:34) and
5'-TGAGCCCCACTGGTGGATGACCAGGACGACCTCTTCCAAACC-3' (SEQ ID NO:35);
for DhaA.D106E:
5'-CTTGGGTTTGGAAGAGGTCGTCCTGGTCATCCACGAATGGGGC-3' (SEQ ID NO:52) and
5'-TGAGCCCCATTCGTGGATGACCAGGACGACCTCTTCCAAACC-3' (SEQ ID NO:53); and
for DhaA.D106Y:
5'-CTTGGGTTTGGAAGAGGTCGTCCTGGTCATCCACTACTGGGGC-3' (SEQ ID NO:54) and
5'-TGAGCCCCAGTAGTGGATGACCAGGACGACCTCTTCCAAACC-3' (SEQ ID NO:55).
The annealed oligonucleotides contained a StyI site at the 5' end and
a BlpI site at the 3' end. The annealed oligonucleotides were digested with StyI
and BlpI and subcloned into GST-DhaA.WT or GST-DhaA.H272F at the StyI and
BlpI sites. All mutants were confirmed by DNA sequencing.
Results 

To confirm that DhaA.H272 mutants were capable of binding Cl-alkanes
with functional groups, these mutants or their GST-fusions, as well as the
corresponding wild-type proteins or fusions, were contacted with
FAM-C14H24O4-Cl, TAMRA-C14H24O4-Cl, ROX.5-C14H24O4-Cl, or
biotin-C18H32O4-Cl for 15 minutes at room temperature. Then the proteins were
resolved on SDS-PAGE. The gels containing proteins incubated with
FAM-C14H24O4-Cl, TAMRA-C14H24O4-Cl, or ROX.5-C14H24O4-Cl were analyzed by
fluoroimager (Hitachi, Japan) at an Eex/Eem appropriate for each fluorophore.
Gels containing proteins incubated with biotin-C18H32O4-Cl were transferred to a
nitrocellulose membrane and probed with HRP-conjugated streptavidin.

As shown in Figure 9, TAMRA-C14H24O4-Cl (lanes 1 and 2 in panel A),
FAM-C14H24O4-Cl (lanes 3 and 4 in panel A), and ROX.5-C14H24O4-Cl (lanes 5
and 6 in panel A) bound to DhaA.H272F (lanes 2, 4 and 6 in panel A) but not to
DhaA.WT (lanes 1, 3 and 5 in panel A). Biotin-C18H34O4-Cl bound to
DhaA.H272F (lanes 9-14 in panel B) but not to DhaA.WT (lanes 1-8 in panel
B). Moreover, the binding of biotin-C18H34O4-Cl to DhaA.H272F (lanes 9-14 in
panel B) was dose dependent and could be detected at 0.2 µM. Further, the bond
between substrates and DhaA.H272F was very strong, since boiling with SDS
did not break the bond.

All tested DhaA.H272 mutants, i.e., H272F/G/A/Q, bound to
TAMRA-C14-Cl (Figure 10). Further, the DhaA.H272 mutants bind the substrates
in a highly specific manner, since pretreatment of the mutants with one of the
substrates (biotin-C18H32O4-Cl) completely blocked the binding of another
substrate (TAMRA-C14H24O4-Cl) (Figure 10).

To determine the nature of the bond between Cl-alkanes and the
DhaA.H272F mutant (or the GST-DhaA.H272F mutant fusion protein), these
proteins were incubated with and without FAM-C14H24O4-Cl, and analyzed by
MALDI. As shown in Figure 11, the bond between mutant DhaA.H272F and
FAM-C14H24O4-Cl is strong. Moreover, the analysis of the E*S complex
indicated the covalent nature of the bond between the substrate (e.g.,
FAM-C14H24O4-Cl) and DhaA.H272F. The MALDI-TOF analysis also confirms that
the substrate/protein adduct is formed in a 1:1 relationship.
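
The kind of mass comparison that supports a covalent 1:1 adduct can be sketched as follows. If the substrate is attached through an ester bond formed by displacement of chloride, as in the mechanism described for Asp106, the adduct should be heavier than the free protein by the substrate mass minus one HCl. The Python sketch below encodes that relationship. The protein and substrate masses used are hypothetical placeholders (the text gives only "about 33 kDa" for DhaA and no substrate mass), and the ester mechanism is taken as an assumption.

```python
HCL_AVERAGE_MASS = 36.46  # g/mol, average mass of HCl


def expected_adduct_mass(protein_mass, substrate_mass, stoichiometry=1):
    """Expected mass of a protein carrying `stoichiometry` substrate molecules,
    each attached by displacement of chloride (net loss of HCl per substrate)."""
    return protein_mass + stoichiometry * (substrate_mass - HCL_AVERAGE_MASS)


# Hypothetical masses for illustration only; not measured values from the text.
protein_mass = 33000.0
substrate_mass = 650.0

adduct_1_to_1 = expected_adduct_mass(protein_mass, substrate_mass)
adduct_2_to_1 = expected_adduct_mass(protein_mass, substrate_mass, stoichiometry=2)

if __name__ == "__main__":
    print(f"free protein:  {protein_mass:.2f}")
    print(f"1:1 adduct:    {adduct_1_to_1:.2f}")
    print(f"2:1 adduct:    {adduct_2_to_1:.2f}")
```

An observed shift matching the 1:1 value and not the 2:1 value would be consistent with a single substrate molecule bound per protein.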

DhaA mutants at another residue in the catalytic triad, residue 106, were
prepared. The residue at position 106 in wild-type DhaA is D, one of the known
nucleophilic amino acid residues. D at residue 106 in DhaA was substituted
with nucleophilic amino acid residues other than D, e.g., C, Y and E, which may
form a bond with a substrate which is more stable than the bond formed between
wild-type DhaA and the substrate. In particular, cysteine is a known nucleophile
in cysteine-based enzymes, and those enzymes are not known to activate water.

A control mutant, DhaA.D106Q, single mutants DhaA.D106C,
DhaA.D106Y, and DhaA.D106E, as well as double mutants
DhaA.D106C:H272F, DhaA.D106E:H272F, DhaA.D106Q:H272F, and
DhaA.D106Y:H272F were analyzed for binding to TAMRA-C14H24O4-Cl
(Figure 12). As shown in Figure 12, TAMRA-C14H24O4-Cl bound to
DhaA.D106C, DhaA.D106C:H272F, DhaA.D106E, and DhaA.H272F. Thus,
the bond formed between TAMRA-C14H24O4-Cl and cysteine or glutamate at
residue 106 in a mutant DhaA is stable relative to the bond formed between
TAMRA-C14H24O4-Cl and wild-type DhaA. Other substitutions at position 106
alone or in combination with substitutions at other residues in DhaA may yield
similar results. Further, certain substitutions at position 106 alone or in
combination with substitutions at other residues in DhaA may result in a mutant
DhaA that forms a bond with only certain substrates.

Example III

Tethering of Luciferase to a Solid Support via a
Mutant DhaA and a Substrate of the Invention

Materials and Methods

phRLuc-linker-DhaA.WT-Flag and phRLuc-linker-DhaA.H272F-Flag
fusion cassettes were constructed by cloning the phRLuc coding region into the
NheI/SalI sites of the pCIneo vector which contains a myristic acid attachment
peptide coding sequence (MAS). Two primers
(5'-GCTTCACTTGTCGTCATCGTCCTTGTAGTCA-3'; SEQ ID NO:11) and
(5'-GCTTCACTTGTCGTCATCGTCCTTGTAGTCA-3'; SEQ ID NO:12) were
designed to add NheI and SalI sites to the 5' and 3' coding regions, respectively,
of phRLuc and to amplify a 900 bp fragment from a phRLuc template (pGL3
vector, Promega). Then, a myristic acid attachment peptide coding sequence
was excised with NheI and SalI restriction enzymes and the amplified fragment
containing phRLuc was inserted into the NheI/SalI restriction sites of the
pCIneo.DhaA.(WT or H272F)-Flag vector. The sequence of each construct was
confirmed by DNA sequencing. Promega's TNT® T7 Quick system was then
used to generate fusion proteins in vitro.
Results

To demonstrate tethering of proteins to a solid support via a
DhaA.H272F-Cl-alkane bridge, vectors encoding a fusion protein of Renilla
luciferase (hRLuc, N-terminus of the fusion), a protein connector (17 amino
acids, see Table I), and DhaA (WT or H272F mutant) were prepared. The Flag
epitope was then fused to the C-terminus of DhaA.

Table I

Fusion     Sequence                                             Peptide Connector
---------  ---------------------------------------------------  -----------------
GST-DhaA   atcgaaggtcgtgggatccccaggaattcccgggtcgacgccgcc        iegrgiprnsrvdaa
           (SEQ ID NO:26)                                       (SEQ ID NO:27)
GFP-DhaA   tccggatcaagcttgggcgacgaggtggacggcgggccctctagagccacc  sgsslgdevdggpsrat
           (SEQ ID NO:28)                                       (SEQ ID NO:29)
DhaA-Rluc  accggttccggatcaagcttgcggtaccgcgggccctctagagcc        tgsgsslryrgpsra
           (SEQ ID NO:30)                                       (SEQ ID NO:31)
Rluc-DhaA  tccggatcaagcttgcggtaccgcgggccctctagagccgtcgacgccgcc  sgsslryrgpsravdaa
           (SEQ ID NO:32)                                       (SEQ ID NO:33)
DhaA-Flag  accggt                                               tg


SDS-PAGE followed by Western blot analysis showed that the proteins
had their predicted molecular weights and were recognized by anti-R.Luc and
anti-FLAG® M2 antibodies. In addition, all fusion proteins had Renilla luciferase
activity (as determined by Promega's Renilla Luciferase Assay System in PBS
pH 7.4 buffer).

Tethering of proteins to a solid support via a DhaA.H272F-Cl-alkane
bridge was shown by using biotin-C18H32O4-Cl as a substrate and streptavidin
(SA)-coated 96 well plates (Pierce, USA) as solid support. Translated proteins
were contacted with biotin-C18H32O4-Cl substrate at 25 µM (final concentration),
for 60 minutes at room temperature. Unbound biotin-C18H32O4-Cl was removed
by gel-filtration on Sephadex G-25 prepackaged columns (Amersham
Biosciences). Collected fractions of R.Luc-connector-DhaA fusions were placed
in an SA-coated 96-well plate for 1 hour at room temperature, unbound proteins
were washed out and luciferase activity was measured.

Figure 13A shows Renilla luciferase activity captured on the plate.
Analysis of these data indicated that only the fusion containing the mutant DhaA
was captured. The efficiency of capturing was very high (more than 50% of
Renilla luciferase activity added to the plate was captured). In contrast, the
efficiency of capturing of fusions containing wild-type DhaA as well as Renilla
luciferase was negligibly small (< 0.1%). Pretreatment of
R.Luc-connector-DhaA.H272F with a non-biotinylated substrate
(TAMRA-C14H24O4-Cl) decreased the efficiency of capturing by about 80%.
Further, there was no effect of pretreatment with a non-biotinylated substrate
on the capturing of the R.Luc-connector-DhaA.WT or Renilla luciferase.
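
The two percentages quoted for the capture experiment are simple ratios of luciferase activity. The Python sketch below shows how they would be computed from plate-reader readings. The relative light unit (RLU) values are hypothetical and were chosen only so that the results fall in the ranges reported (more than 50% captured, about 80% decrease after pretreatment).

```python
def capture_efficiency(captured_activity, added_activity):
    """Percent of the activity added to the well that remains after washing."""
    return 100.0 * captured_activity / added_activity


def percent_decrease(reference, treated):
    """Percent decrease of `treated` relative to `reference`."""
    return 100.0 * (reference - treated) / reference


# Hypothetical luciferase readings in relative light units (RLU).
added_rlu = 1_000_000.0
captured_mutant_rlu = 550_000.0            # mutant DhaA fusion
captured_mutant_blocked_rlu = 110_000.0    # mutant fusion pretreated with TAMRA substrate

efficiency = capture_efficiency(captured_mutant_rlu, added_rlu)
blocking = percent_decrease(captured_mutant_rlu, captured_mutant_blocked_rlu)

if __name__ == "__main__":
    print(f"capture efficiency: {efficiency:.1f}%")
    print(f"decrease after pretreatment: {blocking:.1f}%")
```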

Taken together, these data demonstrate that active enzymes (e.g., Renilla 
luciferase) can be tethered to a solid support that forms part of a substrate of the 
invention (Cl-alkane-DhaA.H272F-bridge), and retain enzymatic activity. 

Example IV

Mutant DhaA and Substrate System In Vivo
A. Covalent Tethering of Functional Groups to DhaA Mutants In Vivo: in
Prokaryotes and Eukaryotes
Materials and Methods

To study the binding of a substrate of the invention to a mutant hydrolase
expressed in prokaryotes, E. coli cells BL21 (λDE3) pLysS were transformed
with pGEX-5X-3.DhaA.WT-Flag or pGEX-5X-3.DhaA.H272F-Flag, grown in
liquid culture, and induced with IPTG. Either TAMRA-C14H24O4-Cl or
biotin-C18H32O4-Cl was added to the induced cells (final concentration, 25 µM).
After 1 hour, cells were harvested, washed with cold PBS (pH 7.3), disrupted by
sonication, and fractionated by centrifugation at 19,800 x g for 1 hour. Soluble
fractions were subjected to SDS-PAGE. Gels with proteins isolated from cells
treated with TAMRA-C14H24O4-Cl were analyzed on a fluoroimager, while
proteins from cells treated with biotin-C18H32O4-Cl were transferred to a
nitrocellulose membrane and probed with HRP-conjugated streptavidin.

To study the binding of TAMRA-C14H24O4-Cl in mammalian cells,
DhaA.WT-Flag and DhaA.H272F-Flag coding regions were excised from
pGEX-5X-3.DhaA.WT-Flag or pGEX-5X-3.DhaA.H272F-Flag, respectively,
gel purified, and inserted into SalI/NotI restriction sites of pCIneo.CMV vector
(Promega). The constructs were confirmed by DNA sequencing.

CHO-K1 cells were plated in 24 well plates (Labsystems) and transfected 
with a pCIneo-CMV.DhaA.WT-Flag or pCIneo-CMV.DhaA.H272F-Flag vector. 
Twenty-four hours later, media was replaced with fresh media containing 25 µM
TAMRA-C14H24O4-Cl and the cells were placed into a CO2 incubator for 60
minutes. Following this incubation, media was removed, cells were quickly
washed with PBS (pH 7.4; four consecutive washes: 1.0 ml/cm2; 5 seconds each)
and the cells were solubilized in a sample buffer (1% SDS, 10% glycerol, and
the like; 250 µl/well). Proteins (10 µl/lane) were resolved on SDS-PAGE (4-20%
gradient gels) and the binding of the TAMRA-C14H24O4-Cl was detected by
a fluoroimager (Hitachi, Japan) at Eex/Eem equal to 540/575 nm.
Results 

Figures 14A and B show the binding of biotin-C18H32O4-Cl (A) and
TAMRA-C12H24O4-Cl (B) to E. coli proteins in vivo. The low molecular weight
band on Figure 14A is an E. coli protein recognizable by HRP-SA, while the
fluorescence detected in the bottom part of Panel B was fluorescence of free
TAMRA-C12H24O4-Cl. Figure 15 shows the binding of TAMRA-C12H24O4-Cl to
eukaryotic cell proteins in vivo.

Analysis of Figure 14 and Figure 15 showed that the DhaA.H272F-Flag
mutant but not DhaA.WT-Flag binds TAMRA-C14H24O4-Cl or biotin-C18H32O4-Cl
in vivo. Moreover, the bond between DhaA.H272F-Flag and the substrate
was very strong (probably covalent), since boiling with SDS followed by
SDS-PAGE did not disrupt the bond between the mutant enzyme and the substrate.

B. Permeability of Cell Membrane to Substrates of the Invention
Materials and Methods 

CHO-K1 cells (ATCC-CCL61) were cultured in a 1:1 mixture of Ham's
F12 nutrients and Dulbecco's modified minimal essential medium supplemented
with 10% fetal bovine serum (FBS), 100 U/ml penicillin, and 100 mg/ml
streptomycin, in an atmosphere of 95% air and 5% CO2 at 37°C.

To study uptake of different substrates, cells were plated in LT-II
chambers (Nunc) or 96 well plates (Labsystems) at a density of 30,000 cells/cm2.
The following day, media was replaced with media containing different
concentrations of the substrates and cells were placed back in a CO2 incubator
for 2, 5 or 15 minutes. At the end of the incubation, media containing substrate
was removed and cells were quickly washed with PBS (pH 7.4; four consecutive
washes: 1.0 ml/cm2; 5 seconds each). Fresh media was then added to cells, and
the cells were returned to the CO2 incubator at 37°C. The level of fluorescence
in cells in 96 well plates was measured on a CytoFluor II fluorescent plate reader
(Beckman) at Eex/Eem equal to 480/520 nm and 540/575 nm for FAM- and
TAMRA-modified substrates, respectively. Fluorescent images of the cells were
taken on an Axiovert-100 inverted epifluorescent microscope (Carl Zeiss) with
filter sets appropriate for detection of FITC and TAMRA.
Results 

As shown in Figure 16, CHO-K1 cells treated with TAMRA-C14H28O4-Cl
(25 µM, 5 minutes at 37°C) could be quickly and efficiently loaded with
TAMRA-C14H28O4-Cl. Image analysis indicated that the fluorescent dye crossed
the cell membrane. Figure 16 also shows that TAMRA-C14H28O4-Cl could be
efficiently washed out of the cells. Taken together, these data indicate that the
plasma membrane of CHO-K1 cells is permeable to TAMRA-C14H28O4-Cl.

In contrast, FAM-C14H24O4-Cl did not cross the plasma membrane of
CHO-K1 cells, even when cells were pretreated with FAM-C14H24O4-Cl at high
concentrations (i.e., 100 µM) and for much longer periods of time (60 minutes)
(data not shown). Thus, the different permeabilities of the cell plasma
membrane for various substrates of the invention, e.g., TAMRA-C14H24O4-Cl
and FAM-C14H24O4-Cl, provide a unique opportunity to label proteins
expressed on the cell surface and proteins expressed inside the cell with different
fluorophores, thereby allowing biplexing.

Example V 

DhaA-based Tethering for Cell Imaging In Vivo 
A. Colocalization of GFP and TAMRA-C14H24O4-Cl in Living Mammalian Cells

Materials and Methods

A GFP-connector-DhaA fusion cassette was constructed by replacing the
Renilla luciferase coding region in Packard's vector coding GFP-DEVD-Rluc(h)
(Packard #6310066) with DhaA.WT-Flag or DhaA.H272F-Flag coding regions.
Two primers (5'-GGAATGGGCCCTCTAGAGCGACGATGTCA-3'; SEQ ID
NO:15, and 5'-CAGTCAGTCACGATGGATCCGCTCAA-3'; SEQ ID
NO:16) were designed to add ApaI and BamHI sites (underlined) to the 5' and 3'
coding regions of DhaA, respectively, and to amplify a 980 bp fragment from a
pGEX-5X-3.DhaA.WT-Flag or pGEX-5X-3.DhaA.H272F-Flag template. The
R.Luc coding region was excised with ApaI and BamHI restriction enzymes.
Then the 980 bp fragment containing DhaA was inserted into the ApaI/BamHI
site of the GFP-DEVD-Rluc(h) coding vector. The sequence of the gene fusion
constructs was confirmed by DNA sequencing.

Cells transiently expressing GFP-connector-DhaA.WT-Flag or
GFP-connector-DhaA.H272F-Flag fusion proteins were plated in LT-II chambers
(Nunc) at a density of 30,000 cells/cm2. The next day, media was replaced with
fresh media containing 25 µM of TAMRA-C14H24O4-Cl and the cells were
placed back into a CO2 incubator for 60 minutes. At the end of the
incubation, media containing substrates was removed, cells were quickly washed
with PBS (pH 7.4; four consecutive washes: 1.0 ml/cm2; 5 seconds each) and
new media was added to the cells. The cells were placed back into a CO2
incubator and after 60 minutes the cells were quickly washed with PBS (pH 7.4;
four consecutive washes: 1.0 ml/cm2; 5 seconds each). Fluorescent images of
the cells were taken on an Axiovert-100 inverted epifluorescent microscope (Carl
Zeiss) with filter sets appropriate for detection of GFP and TAMRA.
Results 

As shown by the images in Figure 17, cells transfected with either
GFP-connector-DhaA.WT-Flag or GFP-connector-DhaA.H272F-Flag showed robust
expression of the protein(s) with light emitting characteristics of GFP. Analysis
of the images of the same cells taken with a TAMRA-filter set showed that cells
expressing GFP-connector-DhaA.WT-Flag were dark and could not be
distinguished from cells that do not express this fusion protein. In contrast, cells
expressing GFP-connector-DhaA.H272F-Flag were very bright and
unmistakably recognizable.

Western blot analysis of proteins isolated from CHO-K1 cells transfected
with GFP-connector-DhaA.WT-Flag or GFP-connector-DhaA.H272F-Flag
vectors showed that these cells expressed proteins that were recognized by an
anti-Flag antibody and had the predicted molecular weight for the fusion proteins
(data not shown). A fluoroscan of the SDS-PAGE gel with these proteins
showed strong/covalent binding of TAMRA to GFP-connector-DhaA.H272F-Flag
and no binding to GFP-connector-DhaA.WT-Flag (Figure 18).
B. Fusion Partners of DhaA in DhaA.WT-Flag and DhaA.H272F-Flag are
Functional

To determine whether fusion of two proteins leads to the loss of the
activity of one or both proteins, several DhaA-based fusion proteins (see Table
II) with DhaA at the C- or N-terminus of the fusion and a connector sequence,
e.g., one having 13 to 17 amino acids, between the two proteins, were prepared.
The data showed that the functional activity of both proteins in the fusion was
preserved.

Table II

N-terminal    Connector   C-terminal    Function of                    Function of
protein                   protein       protein #1                     protein #2

GST           +           DhaA.H272F    Binding to GSS column          binding
GFP                       DhaA.H272F    Green fluorescence             binding
R.Luc                     DhaA.H272F    hydrolysis of coelenterazine   binding
DhaA.H272F                R.Luc         binding                        hydrolysis of coelenterazine
DhaA.H272F                Flag          binding                        Recognized by antibody



C. Toxicity of Cl-Alkanes
Materials and Methods

To study the toxicity of Cl-alkanes, CHO-K1 cells were plated in 96 well
plates to a density of 5,000 cells per well. The next day, media was replaced
with fresh media containing 0-100 µM concentrations of Cl-alkanes and the cells
were placed back into a CO2 incubator for different periods of time. Viability of
the cells was measured with CellTiter-Glo™ Luminescence Cell Viability Assay
(Promega) according to the manufacturer's protocol. Generally, 100 µl of
CellTiter-Glo™ reagent was added directly to the cells and the luminescence
was recorded at 10 minutes using a DYNEX MLX microtiter plate luminometer.
In some experiments, in order to prevent fluorescence/luminescence
interference, the media containing fluorescent Cl-alkanes was removed and the
cells were quickly washed with PBS (pH 7.4; four consecutive washes: 1.0
ml/cm2; 5 seconds each) before addition of CellTiter-Glo™ reagent. Control
experiments indicated that this procedure had no effect on the sensitivity or
accuracy of the CellTiter-Glo™ assay.

Results 

As shown in Figure 19, TAMRA-C14H24O4-Cl showed no toxicity on
CHO-K1 cells even after a 4 hour treatment at a 100 µM concentration (the
highest concentration tested). After a 24 hour treatment, no toxicity was
detected at concentrations of 6.25 µM (the "maximum non-toxic
concentration"). At concentrations > 6.25 µM, the relative luminescence in
CHO-K1 cells was reduced in a dose-dependent manner with an IC50 of about
100 µM. No toxicity of biotin-C18H32O4-Cl was observed even after 24 hours of
treatment at 100 µM. In contrast, ROX5-C14H24O4-Cl had a pronounced toxic
effect, as a reduction of the RLU in CHO-K1 cells could be detected after a 1
hour treatment. The IC50 value of this effect was about 75 µM with no apparent
ATP reduction at a 25 µM concentration. The IC50 value of ROX5-C14H24O4-Cl
toxicity and the "maximum non-toxic concentration" of ROX5-C14H24O4-Cl
decreased in a time-dependent manner reaching 12.5 µM and 6.25 µM,
respectively.
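
As a hedged illustration of how an IC50 of the kind quoted above can be read off a dose-response series, the following sketch interpolates linearly on log10(concentration) between the two doses that bracket 50% of the control luminescence. The function name and the viability values are hypothetical placeholders chosen for illustration; they are not data from Figure 19.

```python
import math

def ic50_log_interp(concs_um, percent_of_control):
    """Estimate IC50 by linear interpolation on log10(concentration).

    concs_um: nonzero concentrations in micromolar.
    percent_of_control: luminescence as % of untreated control at each dose.
    Returns None if no pair of adjacent doses brackets 50%.
    """
    pairs = sorted(zip(concs_um, percent_of_control))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(pairs, pairs[1:]):
        if r_lo >= 50.0 >= r_hi and r_lo != r_hi:
            frac = (r_lo - 50.0) / (r_lo - r_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None

# Hypothetical two-fold dilution series (micromolar) and made-up responses.
doses = [6.25, 12.5, 25.0, 50.0, 100.0]
viability = [100.0, 95.0, 85.0, 70.0, 50.0]
estimate = ic50_log_interp(doses, viability)
print(f"estimated IC50: {estimate:.1f} uM")
```

A fitted four-parameter logistic curve would normally be preferred when enough points are available; interpolation is shown only because it needs no external libraries.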

D. Detection of DhaA.D106C in CHO cells contacted with TAMRA- or
DiAc-FAM-containing substrates and a fixative

CHO cells (ATCC, passage 4) were seeded into 8-well chamber slides
(German coverglass system) at low density in DMEM:F12 media (Gibco)
containing 10% FBS and 1 mM glutamine (growth media) without antibiotics.
Two days later, cells were inspected using an inverted phase microscope. Two
visual criteria were confirmed before applying the transfection reagents: 1) the
level of cellular confluence per chamber was approximately 60-80%, and 2)
>90% of the cells were adherent and showed a flattened morphology. The media
was replaced with 150 µl of fresh pre-warmed growth media and cells were
incubated for approximately 1 hour.

Cells were transfected using the TransIT TKO system (Mirus). The TKO
lipid was diluted by adding 7 µl of lipid per 100 µl of serum-free DMEM:F12
media, and then 1.2 µg of transfection-grade DhaA.D106C DNA was added per
100 µl of lipid containing media. The mixture was incubated at room
temperature for 15 minutes, and then 25 µl aliquots were transferred into
individual culture chambers (0.3 µg DNA). Cells were returned to the incubator
for 5-6 hours, washed two times with growth media, 300 µl of fresh growth
media was added, and then cells were incubated for an additional 24 hours.
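
The per-chamber DNA amount stated above follows directly from the mixing ratios. The short sketch below reproduces that arithmetic; the variable names are illustrative, and it assumes the volumes of added lipid and DNA are negligible relative to the 100 µl of medium.

```python
# Transfection mix, per 100 ul of serum-free medium (values from the text).
LIPID_UL_PER_100UL = 7.0
DNA_UG_PER_100UL = 1.2
ALIQUOT_UL = 25.0  # volume transferred to each culture chamber

# Assumption: lipid and DNA volumes are ignored when scaling to the aliquot.
dna_per_chamber_ug = DNA_UG_PER_100UL * ALIQUOT_UL / 100.0
lipid_per_chamber_ul = LIPID_UL_PER_100UL * ALIQUOT_UL / 100.0

print(f"DNA per chamber:   {dna_per_chamber_ug:.2f} ug")
print(f"lipid per chamber: {lipid_per_chamber_ul:.2f} ul")
```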

Transfected or non-transfected control cells were incubated with 12.5 µM
TAMRA-C14H24O4-Cl or 12.5 µM DiAc-FAM-C14H24O4-Cl in 10%
FBS/DMEM for 30 minutes at 37°C and 5% CO2. Cells were washed with warm
growth media three times, 300 µl fresh growth media was added, and then cells
were incubated for 1 hour.

Growth media was replaced with warm PBS and live cells were
visualized using a Zeiss Axiovert 100 inverted microscope equipped with a
rhodamine filter set (Exciter filter = 540, Emission filter = 560LP) and a
fluorescein filter set (Exciter filter = 490, Emission filter = 520), and a Spot CCD
camera. Images were captured with exposure times of 0.15-0.60 seconds at gain
settings of 4 or 16.

Discrete and specifically labeled transfected cells were evident in both
TAMRA-C14H24O4-Cl and DiAc-FAM-C14H24O4-Cl labeled cells. The majority
of cells were non-transfected cells and they did not retain the label.

The PBS was removed and cells were fixed with 3.7%
paraformaldehyde/0.1% Triton in PBS for 15 minutes. The fixative was
removed, PBS was added, and a second set of images was captured for both
TAMRA-C14H24O4-Cl and DiAc-FAM-C14H24O4-Cl labeled cells.

The PBS was replaced with 50% methanol in PBS and cells were
incubated for 15 minutes, followed by a 15 minute incubation in 95% methanol.
A third set of images was captured and then an equal volume mixture of
methanol and acetone was applied to the cells and incubated for 15 minutes. The
media was replaced with PBS and a fourth set of images was collected.

Results suggested that the binding of the substrates to the DhaA.D106C
mutant was stable following fixation with paraformaldehyde and subsequent
processing of fixed cell samples in methanol and acetone. Furthermore, the
brightness of the TAMRA or FAM fluorescence was unchanged under these
conditions.

Example VI

Mutant Beta-Lactamase (blaZ)-based Tethering
The serine-β-lactamases, enzymes that confer bacterial resistance to
β-lactam antibiotics, likely use the hydroxyl group of a serine residue (Ser70 in
the class A consensus numbering scheme of Ambler et al. (1991)) to degrade a
wide range of β-lactam compounds. The reaction begins with the formation of
a precovalent encounter complex (Figure 20A), and moves through a
high-energy acylation tetrahedral intermediate (Figure 20B) to form a transiently
stable acyl-enzyme intermediate, forming an ester through the catalytic residue
Ser70 (Figure 20C). Subsequently, the acyl-enzyme is attacked by hydrolytic
water (Figure 20D) to form a high-energy deacylation intermediate (Figure
20E) (Minasov et al., 2002), which collapses to form the hydrolyzed product
(Figure 20F). The product is then expelled, regenerating free enzyme. As in
serine proteases, this mechanism requires a catalytic base to activate the serine
nucleophile to attack the amide bond of the substrate and, following formation
of the acyl-enzyme intermediate, to activate the hydrolytic water for attack on
the ester center of the adduct.
A. Mutant β-Lactamase and Fusions Thereof

Materials and Methods

The plasmid pTS32 harboring Staphylococcus aureus PC1 blaZ gene
(Zawadzke et al., 1995) was kindly provided by Dr. O. Herzberg (University of
Maryland Biotechnology Institute). The blaZ gene has the following sequence:
AGCTTACTAT GCCATTATTA ATAACTTAGC CATTTCAACA
CCTTCTTTCA AATATTTATAATAAACTATT GACACCGATA
TTACAATTGT AATATTATTG ATTTATAAAA
ATTACAACTGTAATATCGGA GGGTTTATTT TGAAAAAGTT
AATATTTTTA ATTGTAATTG CTTTAGTTTTAAGTGCATGT
AATTCAAACA GTTCACATGC CAAAGAGTTA AATGATTTAG
AAAAAAAATATAATGCTCAT ATTGGTGTTT ATGCTTTAGA
TACTAAAAGT GGTAAGGAAG TAAAATTTAATTCAGATAAG
AGATTTGCCT ATGCTTCAAC TTCAAAAGCG ATAAATAGTG
CTATTTTGTTAGAACAAGTA CCTTATAATA AGTTAAATAA
AAAAGTACAT ATTAACAAAG ATGATATAGTTGCTTATTCT
CCTATTTTAG AAAAATATGT AGGAAAAGAT ATCACTTTAA
AAGCACTTATTGAGGCTTCA ATGACATATA GTGATAATAC
AGCAAACAAT AAAATTATAA AAGAAATCGGTGGAATCAAA
AAAGTTAAAC AACGTCTAAA AGAACTAGGA GATAAAGTAA
CAAATCCAGTTAGATATGAG ATAGAATTAA ATTACTATTC
ACCAAAGAGC AAAAAAGATA CTTCAACACCTGCTGCCTTC
GGTAAGACCC TTAATAAACT TATCGCCAAT GGAAAATTAA
GCAAAGAAAACAAAAAATTC TTACTTGATT TAATGTTAAA
TAATAAAAGC GGAGATACTT TAATTAAAGACGGTGTTCCA
AAAGACTATA AGGTTGCTGA TAAAAGTGGT CAAGCAATAA
CATATGCTTCTAGAAATGAT GTTGCTTTTG TTTATCCTAA
GGGCCAATCT GAACCTATTG TTTTAGTCATTTTTACGAAT
AAAGACAATA AAAGTGATAA GCCAAATGAT AAGTTGATAA
GTGAAACCGCCAAGAGTGTA ATGAAGGAAT TTTAATATTC
TAAATGCATA ATAAATACTG ATAACATCTTATATTTTGTA
TTATATTTTG TATTATCGTT GAC (SEQ ID NO:36).

GST-blaZ (WT and E166D, N170Q, or E166D:N170Q mutants) fusion
cassettes were constructed by introducing point mutations into the blaZ gene and
cloning the blaZ coding regions into SalI/AgeI sites of pGEX-5X-3 vector. The
internal mutagenic primers were as follows: E166D
(5'-CCAGTTAGATATGACATAGAATTAAATTACTATTCACC-3', SEQ ID
NO:56; 5'-GGTGAATAGTAATTTAATTCTATGTCATATCTAACTGG-3',
SEQ ID NO:57); N170Q
(5'-CCAGTTAGATATGAGATAGAATTACAGTACTATTCACC-3', SEQ ID
NO:58; and 5'-GGTGAATAGTACTGTAATTCTATCTCATATCTAACTGG-3',
SEQ ID NO:59); and E166D:N170Q
(5'-CCAGTTAGATATGACATAGAATTACAGTACTATTCACC-3'; SEQ ID
NO:60 and 5'-GGTGAATAGTACTGTAATTCTATGTCATATCTAACTGG-3';
SEQ ID NO:61). Two external primers
(5'-CAACAGGTCGACGCCGCCATGAAAGAGTTAAATGATTTAG-3', SEQ ID
NO:62; and 5'-GTAGTCACCGGTAAATTCCTTCATTACACTCTTGGC-3',
SEQ ID NO:63) were designed to add an N-terminal SalI site and a Kozak sequence
to the 5' coding region, add an AgeI site to the 3' coding region of blaZ, and to
amplify an 806 bp fragment from a blaZ.WT template. The resulting fragment
was inserted into the SalI/AgeI site of the vector pGEX-5X-3 containing a
glutathione S-transferase (GST) gene, a sequence coding a Factor Xa cleavage
site, and multiple cloning sites (MCS) followed by a sequence coding for Flag
and stop codons. These gene fusion constructs were confirmed by DNA
sequencing.
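
The mutagenic primers listed above come in forward/reverse pairs, so a quick consistency check is to confirm that each reverse primer is the reverse complement of its forward partner and to locate the bases that differ from the wild-type blaZ sequence. The sketch below does this under stated assumptions: the helper names are illustrative, and the wild-type window is taken from the blaZ sequence given in this example (the stretch beginning CCAGTTAGATATGAG).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (uppercase ACGT only)."""
    return seq.translate(COMPLEMENT)[::-1]

def diff_positions(a: str, b: str) -> list:
    """0-based positions at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Wild-type window, copied from the blaZ sequence (SEQ ID NO:36) above.
WILD_TYPE = "CCAGTTAGATATGAGATAGAATTAAATTACTATTCACC"

# (forward, reverse) primer pairs as listed in the text.
PRIMER_PAIRS = {
    "E166D": ("CCAGTTAGATATGACATAGAATTAAATTACTATTCACC",
              "GGTGAATAGTAATTTAATTCTATGTCATATCTAACTGG"),
    "N170Q": ("CCAGTTAGATATGAGATAGAATTACAGTACTATTCACC",
              "GGTGAATAGTACTGTAATTCTATCTCATATCTAACTGG"),
    "E166D:N170Q": ("CCAGTTAGATATGACATAGAATTACAGTACTATTCACC",
                    "GGTGAATAGTACTGTAATTCTATGTCATATCTAACTGG"),
}

pairs_ok = {name: reverse_complement(fwd) == rev
            for name, (fwd, rev) in PRIMER_PAIRS.items()}
changes = {name: diff_positions(WILD_TYPE, fwd)
           for name, (fwd, _) in PRIMER_PAIRS.items()}

for name in PRIMER_PAIRS:
    print(name, "complementary:", pairs_ok[name], "changed positions:", changes[name])
```

In this window the Glu166 codon (GAG) occupies positions 12-14 and the Asn170 codon (AAT) positions 24-26, so the reported positions correspond to GAG to GAC and AAT to CAG.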

The GST-blaZ (WT or mutants) fusion proteins were overexpressed in
competent E. coli BL21 (λDE3) cells and purified essentially as described for
DhaA and GST-DhaA fusion proteins (except the potassium phosphate buffer
(0.1 M, pH 6.8) was used instead of Buffer A). Homogeneity of the proteins
was verified by SDS-PAGE.

The chromogenic substrate 6-β-[(furylacryloyl)amido]penicillanic acid
triethylamine salt (FAP) was purchased from Calbiochem (La Jolla, CA).
Hydrolysis of FAP was monitored by loss of absorbance at 344 nm (Δε =
1330 M⁻¹ cm⁻¹) on a Beckman DU640 spectrophotometer (Beckman Coulter,
Fullerton, CA). All assays were performed at 25°C in 0.1 M potassium
phosphate buffer at pH 6.8.
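
A measured loss of absorbance at 344 nm can be converted into a hydrolysis rate with the Beer-Lambert relation, rate = (ΔA/Δt) / (Δε × path length). The sketch below is a minimal illustration; the function name, the 1 cm path length, and the example slope are assumptions, and only the extinction coefficient difference of 1330 M⁻¹ cm⁻¹ comes from the text.

```python
DELTA_EPSILON_FAP = 1330.0  # M^-1 cm^-1 at 344 nm, from the text

def fap_hydrolysis_rate_um_per_min(delta_a344_per_min: float,
                                   path_length_cm: float = 1.0) -> float:
    """Convert absorbance loss (A344 units per minute) to micromolar FAP per minute.

    Assumes a 1 cm cuvette unless another path length is supplied.
    """
    molar_per_min = delta_a344_per_min / (DELTA_EPSILON_FAP * path_length_cm)
    return molar_per_min * 1e6

# Hypothetical slope: absorbance falling by 0.0133 units per minute.
rate = fap_hydrolysis_rate_um_per_min(0.0133)
print(f"{rate:.2f} uM FAP hydrolyzed per minute")
```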

In CCF2, the cephalosporin core links a 7-hydroxycoumarin to a
fluorescein. In the intact molecule, excitation of the coumarin (Eex = 409 nm)
results in FRET to the fluorescein, which emits green light (Eem = 520 nm).
Cleavage of CCF2 by β-lactamase results in spatial separation of the two dyes,
disrupting FRET such that excitation of coumarin now gives rise to blue
fluorescence (Eem = 447 nm). CCF2 was purchased from Aurora Biosciences
Corporation (San Diego, CA). Reduction of the FRET signal and an increase in
blue fluorescence were measured on a CytoFluor II Fluorescence Multi-well
Plate Reader (PerSeptive Biosystems, Framingham, MA, USA).
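
Because CCF2 cleavage lowers the green (520 nm) FRET emission while raising the blue (447 nm) emission, the two readings are commonly combined into a single blue/green ratio. The sketch below shows one way this could be computed; the function name, the background subtraction step, and all numeric readings are hypothetical and are not taken from the experiments described here.

```python
def blue_green_ratio(blue_447: float, green_520: float,
                     blank_blue: float = 0.0, blank_green: float = 0.0) -> float:
    """Background-corrected ratio of blue (447 nm) to green (520 nm) emission.

    A higher ratio indicates more CCF2 cleavage (loss of FRET).
    """
    green = green_520 - blank_green
    if green <= 0:
        raise ValueError("green channel must exceed its blank")
    return (blue_447 - blank_blue) / green

# Hypothetical plate-reader values (arbitrary fluorescence units).
intact = blue_green_ratio(blue_447=300.0, green_520=2100.0,
                          blank_blue=100.0, blank_green=100.0)
cleaved = blue_green_ratio(blue_447=2100.0, green_520=500.0,
                           blank_blue=100.0, blank_green=100.0)
print(f"intact substrate ratio:  {intact:.2f}")
print(f"cleaved substrate ratio: {cleaved:.2f}")
```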
Results 

All β-lactamases, including β-lactamase from Staphylococcus aureus
PC1, hydrolyze β-lactams of different chemical structure. The efficiency of
hydrolysis depends on the type of the enzyme and chemical structure of the
substrate. Penicillin is considered to be a preferred substrate for β-lactamase
from Staphylococcus aureus PC1.

The effect of point mutation(s) on the ability of β-lactamase to hydrolyze
penicillins was studied as described in Zawadzke et al. (1995). As shown in
Figure 20, a GST-β-lactamase PC1 fusion protein efficiently hydrolyzed FAP.
Hydrolysis of FAP by blaZ.E166D, blaZ.N170Q or blaZ.E166D:N170Q
mutants could not be detected even after 60 minutes of co-incubation.
Therefore, these mutations lead to significant inactivation of blaZ.

To show that blaZ.E166D, blaZ.N170Q, or blaZ.E166D:N170Q mutants
bind β-lactams, and therefore different functional groups could be tethered to
these proteins via β-lactams, GST fusions of these mutants were incubated with
BOCELLIN™ FL, a fluorescent penicillin (Molecular Probes Inc., Eugene, OR).
Proteins were resolved on SDS-PAGE and analyzed on a fluoroimager (Hitachi,
Japan) at an Eex/Eem appropriate for the particular fluorophore. The data in
Figure 22 show that all blaZ mutants bind bocellin. Moreover, the bond
between blaZ mutants and fluorescent substrates was very strong, and probably
covalent, since boiling with SDS followed by SDS-PAGE did not disrupt the
bond. Also, the binding efficiency of the double mutant blaZ.E166D:N170Q
(judged by the strength of the fluorescent signal of protein-bound fluorophore)
was much higher than the binding efficiency of either of the single mutants, and
the binding efficiency of blaZ.N170Q was higher than the binding efficiency of
blaZ.E166D. These data, in combination with current understanding of the role
of the individual amino acids in hydrolysis of beta-lactams, show that additional
mutations (e.g., a mutation of an auxiliary amino acid) can improve efficiency of
tethering of functional groups to a mutated protein.

The effect of point mutation(s) on the ability of β-lactamase to hydrolyze
cephalosporins was also studied using CCF2, a FRET-based substrate described
by Zlokarnik et al. (1998). As shown in Figure 23, the GST-β-lactamase PC1
fusion protein efficiently hydrolyzed CCF2 (lane 2). Single point mutations
(i.e., E166D or N170Q) reduced the ability of the fusion proteins to hydrolyze
CCF2 (lanes 3 and 4). The replacement of two amino acids
(blaZ.E166D:N170Q mutants, lane 5) had an even more pronounced effect on
the CCF2 hydrolysis. However, all blaZ mutants were capable of hydrolyzing
CCF2.

Thus, an amino acid substitution at position 166 or 170, e.g., Glu166Asp
or Asn170Gly, enables the mutant beta-lactamase to trap a substrate and therefore
tether the functional group of the substrate to the mutant beta-lactamase via a
stable, e.g., covalent, bond. Moreover, mutation of an amino acid that has an
auxiliary effect on H2O activation increased the efficiency of tethering.

Example VII 

Targeting of DhaA.H272F to the Nucleus and Cytosol of Living Cells

Materials and Methods 

A GFP-connector-DhaA.H272F-NLS3 fusion cassette was constructed
by inserting a sequence encoding NLS3 (three tandem repeats of the Nuclear
Localization Sequence (NLS) from simian virus large T-antigen) into the
AgeI/BamHI sites of a pCIneo.GFP-connector-DhaA.H272F-Flag vector. Two
complementary oligonucleotides
(5'-CCGGTGATCCAAAAAAGAAGAGAAAGGTAGATCCAAAAAAGAAGAG
AAAGGTAGATCCAAAAAAGAAGAGAAAGGTATGAG-3', sense, SEQ ID
NO:37, and 5'-
GATCCTCATACCTTTCTCTTCTTT
TGGATCTACCTTTCTCTTCTTTTTTGGATCA-3', antisense, SEQ ID
NO:38) coding for the NLS3 peptide, were annealed. The annealed DNA had an
AgeI site at the 5' end and a BamHI site at the 3' end. The annealed DNA was
subcloned into the GFP-connector-DhaA.H272F-Flag construct at the
AgeI/BamHI sites. The sequence of the gene fusion construct was confirmed by
DNA sequencing.
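
As an illustrative check that the sense oligonucleotide encodes three tandem NLS repeats, the sketch below translates it with the standard genetic code. The reading frame (starting after the five-base CCGGT AgeI end) is an assumption inferred from the sequence itself, and the helper names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna: str) -> str:
    """Translate complete codons of an uppercase DNA string ('*' marks stop)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

# Sense oligonucleotide (SEQ ID NO:37), joined from the two printed lines.
SENSE = ("CCGGTGATCCAAAAAAGAAGAGAAAGGTAGATCCAAAAAAGAAGAG"
         "AAAGGTAGATCCAAAAAAGAAGAGAAAGGTATGAG")

AGEI_END = "CCGGT"  # assumed 5' cohesive-end portion preceding the coding frame
peptide = translate(SENSE[len(AGEI_END):])
print(peptide)
print("NLS repeats found:", peptide.count("PKKKRKV"))
```

The motif counted here, PKKKRKV, is the well-known large T-antigen nuclear localization signal.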

A DhaA.H272F-β-arrestin2 fusion cassette was constructed by replacing
the pGFP2 coding region in Packard's vector encoding GFP2-β-arrestin2
(Packard #6310176-1F1) with the DhaA.H272F-Flag coding region. Two
primers (5'-ATTATGCTGAGTGATATCCC-3'; SEQ ID NO:39, and
5'-CTCGGTACCAAGCTCCTTGTAGTCA-3'; SEQ ID NO:40) were designed to
add a KpnI site to the 3' coding region of DhaA, and to amplify a 930 bp
fragment from a pGEX5X-3.DhaA.H272F-Flag template. The pGFP2 coding
region was excised with NheI and KpnI restriction enzymes, then the 930 bp
fragment encoding DhaA.H272F was inserted into the NheI and KpnI
sites of the GFP2-β-arrestin2 coding vector. The sequence of the fusion
construct was confirmed by DNA sequencing.

CHO-K1 or 3T3 cells transiently expressing
GFP-connector-DhaA.H272F-NLS3, GFP2-β-arrestin2 or DhaA.H272F-β-arrestin2 fusion
proteins were plated in LT-II chambers (Nunc) at a density of 30,000 cells/cm2.
The next day, media was replaced with fresh media containing 25 µM of
TAMRA-C14H24O4-Cl and the cells were placed back into a CO2 incubator for
60 minutes. At the end of the incubation, substrate media was removed, cells
were quickly washed with PBS (pH 7.4; four consecutive washes: 1.0 ml/cm2; 5
seconds each), and new media was added to the cells. The cells were placed
back into a CO2 incubator and after 60 minutes the cells were quickly washed
with PBS (pH 7.4; 1.0 ml/cm2). Fluorescent images of the cells were taken on
a Pascal-5 confocal microscope (Carl Zeiss) with filter sets appropriate for the
detection of GFP and TAMRA.
Results 

As shown by the images in Figure 24, GFP and TAMRA were
co-localized in the cell nucleus of cells expressing GFP-connector-DhaA.H272F-
NLS3 and contacted with TAMRA-C14H24O4-Cl.

As shown by the images in Figure 25, GFP-β-arrestin2 expressing cells
have a typical β-arrestin2 cytosolic localization. A fluoroscan of the SDS-PAGE
gel of DhaA.H272F-β-arrestin2 showed strong binding of a TAMRA containing
DhaA substrate to cells expressing DhaA.H272F-β-arrestin2.



Example VIII
Site-Directed Mutagenesis of DhaA Catalytic Residue 130
Haloalkane dehalogenases use a three-step mechanism for cleavage of
the carbon-halogen bond. This reaction is catalyzed by a triad of amino acid
residues composed of a nucleophile, base and acid which, for the haloalkane
dehalogenase from Xanthobacter autotrophicus (DhlA), are residues Asp124,
His289 and Asp260, respectively (Franken et al., 1991), and in Rhodococcus
dehalogenase enzyme (DhaA), Asp106, His272 and Glu130 (Newman et al.,
1999).

Unlike the haloalkane dehalogenase nucleophile and base residues, the
role of the third member of the catalytic triad is not yet fully understood. The
catalytic acid is hydrogen bonded to the catalytic His residue and may assist the
His residue in its function by increasing the basicity of nitrogen in the imidazole
ring. Krooshof et al. (1997), using site-directed mutagenesis to study the role of
the DhlA catalytic acid Asp260, demonstrated that a D260N mutant was
catalytically inactive. Furthermore, this residue apparently had an important
structural role since the mutant protein accumulated mainly in inclusion bodies.
The haloalkane dehalogenase from Sphingomonas paucimobilis (LinB) is the
enzyme involved in γ-hexachlorocyclohexane degradation (Nagata et al., 1997).
Hynkova et al. (1999) replaced the putative catalytic residue (Glu-132) of
LinB with a glutamine (Q) residue. However, no activity was observed for the
E132Q mutant even at very high substrate concentrations.

To examine the role of the DhaA catalytic triad acid Glu130 in protein
production and on the ability of the mutant protein to form covalent
alkyl-enzyme intermediates with a fluorescent-labeled haloalkane substrate,
site-directed mutagenesis was employed to replace the DhaA glutamate (E) residue at
position 130 with glutamine, leucine and alanine.
Materials and Methods 

Strains and plasmids. Ultracompetent E. coli XL10 Gold (Stratagene;
Tetr Δ(mcrA)183 Δ(mcrCB-hsdSMR-mrr)173 endA1 supE44 thi-1 recA1 gyrA96
relA1 lac Hte [F' proAB lacIqZΔM15 Tn10 (Tetr) Amy Camr]) was used as a
host in transformation of site-directed mutagenesis reactions. E. coli strain
JM109 (e14-(McrA-) recA1 endA1 gyrA96 thi-1 hsdR17(rK- mK+) supE44
relA1 Δ(lac-proAB) [F' traD36 proAB lacIqZΔM15]) was used as the host for
gene expression and whole cell enzyme labeling studies. A GST-DhaA-FLAG
gene fusion cloned into plasmid pGEX5X3, designated
pGEX5X3DhaAWT.FLAG, was used as the starting template for E130
mutagenesis. A mutant plasmid containing a H272F mutation in DhaA,
designated pGEX5X3DhaAH272F-FLAG, was used as a positive control in
labeling studies and the cloning vector pGEX5X3 was used as a negative
control.

Site-directed mutagenesis of the DhaA E130 residue. The sequence of
the oligonucleotides used for mutagenesis is shown below. The underlined
nucleotides indicate the position of the altered codons. The oligonucleotides
were synthesized by Integrated DNA Technologies (Coralville, IA) at the 100
nmole scale and modified by phosphorylation at the 5' end.

DhaA E130Q 5'
CAAAGGTATTGCATGTATGCAGTTCATCCGGCCTATCCCG 3' (SEQ ID
NO:41)

DhaA E130L 5'
GTCAAAGGTATTGCATGTATGCTGTTCATCCGGCCTATCCCGAC 3'
(SEQ ID NO:42)

DhaA E130A 5' AGGTATTGCATGTATGGCGTTCATCCGGCCTATCCC 3'
(SEQ ID NO:43)

Site-directed mutagenesis was performed using the QuikChange Multi kit
according to the manufacturer's instructions (Stratagene, La Jolla, CA). The
mutagenesis reactions were introduced into competent E. coli XL10 Gold cells
and transformants were selected on LB agar plates containing ampicillin (100
µg/mL). Plasmid DNA isolated from individual transformants was initially
screened for the loss of an EcoRI site due to replacement of the glutamate codon
(GAAttc). Clones suspected of containing the desired codon change from each
reaction were selected and subjected to DNA sequence analysis (SeqWright,
Houston, TX). The primer used to confirm the sequence of the mutants in the
pGEX5X3 vector was as follows: 5' GGGCTGGCAAGCCACGTTTGGTG 3'
(SEQ ID NO:44).
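
The EcoRI screen described above works because the wild-type glutamate codon (GAA) together with the following TTC forms the EcoRI recognition site GAATTC, which each mutagenic oligonucleotide destroys. The sketch below illustrates this; the helper names are hypothetical, the position of codon 130 is located by assuming it directly follows the shared TGTATG stretch in each oligonucleotide, and the wild-type window is reconstructed from the GAAttc codon context quoted in the text.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

ECORI_SITE = "GAATTC"
ANCHOR = "TGTATG"  # assumed: bases immediately 5' of codon 130 in every oligo

# Mutagenic oligonucleotides as listed in the text (SEQ ID NOs: 41-43).
OLIGOS = {
    "E130Q": "CAAAGGTATTGCATGTATGCAGTTCATCCGGCCTATCCCG",
    "E130L": "GTCAAAGGTATTGCATGTATGCTGTTCATCCGGCCTATCCCGAC",
    "E130A": "AGGTATTGCATGTATGGCGTTCATCCGGCCTATCCC",
}

def codon_130(oligo: str) -> str:
    """Return the three bases that follow the anchor sequence."""
    start = oligo.index(ANCHOR) + len(ANCHOR)
    return oligo[start:start + 3]

encoded = {name: CODON_TABLE[codon_130(seq)] for name, seq in OLIGOS.items()}
has_ecori = {name: ECORI_SITE in seq for name, seq in OLIGOS.items()}

# Wild-type window reconstructed from the stated GAA codon followed by TTC.
wild_type_window = ANCHOR + "GAA" + "TTC"

print("encoded residues:", encoded)
print("EcoRI site retained in mutant oligos:", has_ecori)
print("EcoRI site in wild-type window:", ECORI_SITE in wild_type_window)
```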

DhaA mutant analysis. The three DhaA E130 substitution mutants were
compared to the following constructs: wild-type DhaA, DhaA.H272F, and a
DhaA negative control (pGEX5X3 vector only). Overnight cultures of each
clone were grown in 2 mL of LB containing ampicillin (100 µg/mL) by shaking
at 30°C. The overnight cultures were diluted 1:50 into a sterile flask containing
50 mL fresh LB medium and ampicillin (100 µg/mL). The cultures were
incubated with shaking at 25°C to minimize the production of insoluble protein
species. When the cultures reached mid-log phase (OD600=0.6), IPTG (0.1 mM)
was added and the cultures were incubated with shaking at 25°C for an
additional 22 hours. For labeling of whole cells with a tetramethylrhodamine
(TAMRA) haloalkane conjugated substrate, the cell density of each culture was
adjusted to OD600=1 prior to adding substrate to a concentration of 15 µM. The
cells were incubated with gentle agitation at 4°C for approximately 18 hours.
Following incubation, 20 µl of cells from each labeling reaction was added to 6
µl of 4X SDS loading dye and the samples were boiled for about 3 minutes prior
to being loaded onto a 4-20% acrylamide gel (Tris glycine). For in vitro labeling
studies, crude lysates of IPTG induced cultures were prepared by collecting 3
mL of cells (OD600=1) and resuspending the resulting pellet in 75 µL PBS.
Following a freeze/thaw step, 225 µL of 1X Cell Culture Lysis Reagent
(Promega Corp., Madison, WI) containing 1.25 mg/mL lysozyme was added to
facilitate lysis of the cells. A 20 µL sample of each lysate was combined with 25
µL of 1X PBS. The TAMRA labeled haloalkane substrate was added to a final
concentration of 25 µM. The labeling reactions were incubated at room
temperature for 2 hours. A 25 µl sample of each labeling reaction was added to
6 µl of 4X SDS loading dye and the samples were boiled for about 3 minutes
prior to being loaded onto a 4-20% acrylamide gel (Tris glycine). The gels were
imaged using a FluorImager SI instrument (Amersham Biosciences, Piscataway,
NJ) set to detect emission at 570 nm.

Cell-free lysates were generated by centrifugation of crude lysates for 15
minutes at 14,000 RPM. Protein production was monitored by SDS-PAGE and
Western blot analysis. Proteins transferred to a PVDF membrane were
incubated with an anti-FLAG® antibody conjugated with alkaline phosphatase
(AP) (Sigma, St. Louis, MO). The blot was developed with the Western Blue
stabilized substrate for alkaline phosphatase (Promega Corp., Madison, WI).
Results

The role of the DhaA catalytic acid in the hydrolysis of the alkyl-enzyme
intermediate was probed by site-directed mutagenesis. The DhaA codon E130
was replaced with a codon for glutamine (Q), leucine (L) or alanine (A), as these
substitutions would likely be least disruptive to the structure of the enzyme.
Following mutagenesis, restriction endonuclease screening and DNA sequence
analysis were used to verify the desired codon changes. Sequence verified
DhaA.E130Q, DhaA.E130L and DhaA.E130A clones, designated C1, A5 and
A12, respectively, were chosen for further analysis. The E130 mutants were
analyzed for protein expression and for their ability to form a covalent
alkyl-enzyme intermediate with a TAMRA labeled haloalkane substrate. The three
E130 gene variants were over-expressed in E. coli JM109 cells following
induction with IPTG. SDS-PAGE analysis of crude cell lysates showed that
cultures expressing the wild-type and mutant dhaA genes accumulated protein to
approximately the same level (Figure 26; lanes 2, 4, 6, 8, 10, and 12).
Furthermore, the DhaA protein that was produced by the wild-type and H272F
constructs was for the most part soluble since the amount of protein did not
change appreciably after centrifugation (Figure 26; lanes 3 and 5). The abundant
22 kDa protein bands present in the vector only lanes (Figure 26; lanes 6 and 7)
represented the GST protein. These results, however, are in stark contrast to the
DhaA.E130Q, DhaA.E130L and DhaA.E130A mutants, which appeared to
accumulate predominantly insoluble DhaA protein. This conclusion is based on
the observation that after centrifugation, there was a significant loss in the
amount of DhaA protein present in cell-free lysates (Figure 26; lanes 9, 11, and
13). Nevertheless, a protein band that comigrates with DhaA was clearly
observed in each of the DhaA.E130 mutant lanes after centrifugation (+s), suggesting
the presence of soluble enzyme. Western analysis was, therefore, used to
determine if the protein bands observed in the DhaA.E130 mutants following
centrifugation represented soluble DhaA material. The immunoblot shown in
Figure 27 confirmed the presence of soluble DhaA protein in each of the
DhaA.E130 mutant cell-free lysates (lanes 9, 11, and 13).

The DhaA.E130 mutants were also examined for their ability to generate
an alkyl-enzyme covalent intermediate. Crude lysates prepared from IPTG
induced cultures of the various constructs were incubated in the presence of the
TAMRA labeled substrate. Figure 28 showed that the DhaA.H272F mutant
(lane 3) was very efficient at producing this intermediate. No such product
could be detected with either the WT DhaA or negative control lysates. Upon
initial examination, the DhaA.E130 mutants did not appear to produce detectable
levels of the covalent product. However, upon closer inspection of the
fluoroimage, extremely faint bands were observed that could potentially represent
minute amounts of the covalent intermediate (Figure 28; lanes 5-7). Based on
these results, the ability of whole cells to generate a covalent, fluorescent
alkyl-enzyme intermediate was investigated.

Figure 29 shows the results of an in vivo labeling experiment comparing
each of the DhaA.E130 mutants with positive (DhaA.H272F mutant) and
negative (DhaA-) controls. As expected, the DhaA.H272F mutant was capable
of generating a covalent alkyl-enzyme intermediate as evidenced by the single
fluorescent band near the molecular weight predicted for the GST-DhaA-Flag
fusion (Figure 29, lane 3). As previously observed with the in vitro labeling
results, no such product could be detected with either the wild-type or negative
control cultures (Figure 29, lanes 2 and 3) but very faint fluorescent bands
migrating at the correct position were again detected with all three DhaA.E130
substituted mutants (Figure 29, lanes 5-7). These results point to the possibility
that the DhaA.E130Q, L and A mutants have the ability to trap covalent
alkyl-enzyme intermediates. This reaction, however, appears to
proceed at a dramatically reduced rate compared to the DhaA.H272F mutant
enzyme.

The results of this mutagenesis study suggest that the DhaA catalytic acid
residue DhaA.E130 plays an important structural role in the correct folding of
the enzyme. The DhaA protein was clearly sensitive to substitutions at this
amino acid position as evidenced by the presence of largely insoluble protein
complexes in the DhaA.E130Q, DhaA.E130L and DhaA.E130A crude lysates.
Nevertheless, based on SDS-PAGE and immunoblot analyses, a significant
quantity of soluble DhaA protein was detected in the cell-free lysates of all three
DhaA.E130 mutants.

Example IX

Capturing of DhaA.H272F-Flag and DhaA.H272F-Flag Renilla Luciferase
Fusion Proteins Expressed in Living Mammalian Cells
Materials and Methods 

CHO-K1 cells were plated in 24 well plates (Labsystems) at a density of
30,000 cells/cm2 and transfected with a pCIneo.DhaA.WT-Flag or
pCIneo.hRLuc-connector-DhaA.H272F-Flag vector. Twenty-four hours later,
media was replaced with fresh media containing 25 µM biotin-C18H32O4-Cl and
0.1% DMSO, or 0.1% DMSO alone, and the cells were placed in a CO2
incubator for 60 minutes. At the end of the incubation, the media was removed,
cells were quickly washed with PBS (pH 7.4; four consecutive washes; 1.0
mL/cm2; 5 seconds each) and new media was added to the cells. In some
experiments, the media was not changed. The cells were placed back in a CO2
incubator.

After 60 minutes, media was removed, and the cells were collected in
PBS (pH=7.4, 200 µL/well, RT) containing protease inhibitors (Sigma #P8340).
The cells were lysed by trituration through a needle (Ml 23GTW). Then, cell
lysates were incubated with MagnaBind Streptavidin coated beads (Pierce
#21344) according to the manufacturer's protocol. Briefly, cell lysates were
incubated with beads for 60 minutes at room temperature (RT) using a rotating
disk. Unbound material was collected; beads were washed with PBS (3 x 500
µL, pH=7.4, RT) and resuspended in SDS-sample buffer (for SDS-PAGE
analysis) or PBS (pH=7.4, for determination of R.Luc activity). Proteins were
resolved on SDS-PAGE, transferred to a nitrocellulose membrane, analyzed with
anti-Flag-Ab or anti-R.Luc-Ab, and bound antibody detected by an enhanced
chemiluminescence (ECL) system (Pharmacia-Amersham). Activity of hR.Luc
bound to beads was determined using Promega's "Renilla Luciferase Assay
System" according to the manufacturer's protocol.
Results 

Capturing of proteins expressed in living cells allows for analysis of
those proteins with a variety of analytic methods/techniques. A number of
capturing tools are available, although most of those tools require generation of a
highly specific antibody or genetically fusing a protein of interest with specific
tag peptides/proteins (Jarvik and Telmer, 1998; Ragaut et al., 1999). However,
those tags have only limited use for live cell imaging. To capture DhaA.H272F
and functional proteins fused to DhaA.H272F, SA-coated beads were used
(Savage et al., 1992).
Biotin-C18H32O4-Cl was efficiently hydrolyzed by wild-type DhaA, and
covalently bound to DhaA.H272F and DhaA.H272F fusion proteins in vitro and
in vivo. Moreover, binding was observed both in E. coli and in mammalian
cells. Control experiments indicated that about 80% of the DhaA.H272F-Flag
protein expressed in CHO-K1 cells was labeled after a 60 minute treatment.

CHO-K1 cells transiently expressing DhaA.H272F-Flag were treated
with biotin-C18H32O4-Cl. Biotin-C18H32O4-Cl treated cells were lysed and cell
lysates were incubated with SA-coated beads. Binding of DhaA.H272F to beads
was analyzed by Western blot using anti-Flag® antibody. As shown in Figure
30D, DhaA.H272F-Flag capturing was not detected in the absence of
biotin-C18H32O4-Cl treatment. At the same time, more than 50% of the DhaA.H272F-
Flag expressed in cells was captured on SA-coated beads if the cells were treated
with biotin-C18H32O4-Cl.

To show the capturing of functionally active proteins fused to
DhaA.H272F-Flag, cells were transfected with a vector encoding
hR.Luc-connector-DhaA.H272F-Flag, and the luciferase activity captured on the beads
measured. As shown in Figure 30C, significant luciferase activity was detected
on beads incubated with a lysate of biotin-C18H32O4-Cl treated cells. At the
same time, no luciferase activity was detected on beads incubated with a lysate
from cells that were not treated with biotin-C18H32O4-Cl. Moreover, no hR.Luc
activity was detected on beads incubated with lysate from the cells treated with
biotin-C18H32O4-Cl when free biotin-C18H32O4-Cl was not washed out.

Taken together, these data show that functionally active protein (hR.Luc)
fused to DhaA.H272F can be efficiently captured using biotin-C18H32O4-Cl
and SA-coated beads. The capture is biotin-dependent, and can be competed off
by an excess of biotin-C18H32O4-Cl. As a significant inhibitory effect of the beads
on the hR.Luc activity was observed (data not shown), SDS-PAGE and Western
blot analysis with anti-R.Luc antibody were used to estimate the efficiency of
capture of hR.Luc-connector-DhaA.H272F-Flag fusion protein. As shown in
Figure 30D, more than 50% of hR.Luc-connector-DhaA.H272F-Flag fusion
protein can be captured in a biotin-dependent manner. This is in good agreement
with the capturing efficiency of DhaA.H272F-Flag (see Figure 30A).

Example X

Optimized DhaA Gene
DhaA General Sequence Design

A synthetic DhaA.H272F gene was prepared which had a human codon
bias, low CG content, selected restriction enzyme recognition sites and a reduced
number of transcription regulatory sites. Relative to the amino acid sequence
encoded by a wild-type DhaA gene which lacks a signal sequence (SEQ ID
NO:51), and/or to DhaA.H272F, the amino acid sequence of a codon-optimized
DhaA gene and flanking sequences included: 1) a Gly inserted at position 2, due
to introduction of an improved Kozak sequence (GCCACCATGG; SEQ ID
NO:45) and a BamHI site (thus the H272F active site mutation in DhaA mutants
with the Gly insertion is at position 273); 2) an A292G substitution due to
introduction of a SmaI/XmaI/AvaI site which, in the DhaA mutant with the Gly
insertion, is at position 293; 3) the addition of Ala-Gly at the C-terminus due to
introduction of a NaeI (NgoMIV) site; 4) the addition of NheI, PvuII, EcoRV and
NcoI sites in the 5' flanking sequence; 5) the addition of NNNN in the 5' flanking
sequence to eliminate search algorithm errors at the end and to maintain the
ORF1 (i.e., NNN-NGC-TAG-CCA-GCT-GGC-GAT-ATC-GCC-ACC-ATG-
GGA; SEQ ID NO:46); 6) at the 3' end a NotI site, the addition of NNNN to
eliminate search algorithm errors at the end, a PacI site with ORF Leu-Ile-Lys,
and two stop codons, at least one of which is a TAA (i.e.,
TAATAGTTAATTAAGTAAGCGGCCGCNNNN; SEQ ID NO:47).
SEQ ID NO:51 has the following sequence: 

atgtcagaaatcggtacaggcttccccttcgacccccattatgtggaagtcctgggcgagcgtat 
gcactacgtcgatgttggaccgcgggatggcacgcctgtgctgttcctgcacggtaacccgacct
cgtcctacctgtggcgcaacatcatcccgcatgtagcaccgagtcatcggtgcattgctccagac 
ctgatcgggatgggaaaatcggacaaaccagacctcgattatttcttcgacgaccacgtccgcta 
cctcgatgccttcatcgaagccttgggtttggaagaggtcgtcctggtcatccacgactggggct 
cagctctcggattccactgggccaagcgcaatccggaacgggtcaaaggtattgcatgtatggaa 
ttcatccggcctatcccgacgtgggacgaatggccggaattcgcccgtgagaccttccaggcctt 
ccggaccgccgacgtcggccgagagttgatcatcgatcagaacgctttcatcgagggtgcgctcc
cgaaatgcgtcgtccgtccgcttacggaggtcgagatggaccactatcgcgagcccttcctcaag 
cctgttgaccgagagccactgtggcgattccccaacgagctgcccatcgccggtgagcccgcgaa 
catcgtcgcgctcgtcgaggcatacatgaactggctgcaccagtcacctgtcccgaagttgttgt
tctggggcacacccggcgtactgatccccccggccgaagccgcgagacttgccgaaagcctcccc
aactgcaagacagtggacatcggcccgggattgcactacctccaggaagacaacccggaccttat 
cggcagtgagatcgcgcgctggctccccgcactctag 

Codon Selection

Codon usage data was obtained from the Codon Usage Database
(http://www.kazusa.or.jp/codon/), which is based on GenBank Release 131.0 of
15 August 2002 (see Nakamura et al., 2000). Codon usage tables were
downloaded for: HS: Homo sapiens [gbpri] 50,031 CDS's (21,930,294 codons);
MM: Mus musculus [gbrod] 23,113 CDS's (10,345,401 codons); EC:
Escherichia coli [gbbct] 11,985 CDS's (3,688,954 codons); and EC K12:
Escherichia coli K12 [gbbct] 4,291 CDS's (1,363,716 codons). HS and MM
were compared and found to be closely similar, thus the HS table was used. EC
and EC K12 were compared and found to be closely similar, therefore the EC
K12 table was employed.

The overall strategy for selecting codons was to adapt codon usage for
optimal expression in mammalian cells while avoiding low-usage E. coli codons.
One "best" codon was selected for each amino acid and used to back-translate
the desired protein sequence to yield a starting gene sequence. Another selection
criterion was to avoid high usage frequency HS codons which contain CG
dinucleotides, as methylation of CG has been implicated in transcriptional gene
regulation and can cause down-regulation of gene expression in stable cell lines.
Thus, all codons containing CG (8 human codons) and TA (4 human codons,
except for Tyr codons) were excluded. Codons ending in C were also avoided as
they might form a CG with a downstream codon. Of the remaining codons,
those with highest usage in HS were selected, unless a codon with a slightly
lower usage had substantially higher usage in E. coli.
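The exclusion rules above can be illustrated with a minimal Python sketch. It is not the procedure actually used (Vector NTI codon usage tables were employed, as described below); the function names are hypothetical, the usage values are placeholder numbers rather than figures from the Codon Usage Database, and the E. coli tie-break is omitted.

```python
# Illustrative sketch of the codon exclusion rules; names and numbers are assumptions.

# Placeholder usage frequencies (per thousand) for three amino acids only.
# These values are hypothetical and are NOT taken from the Codon Usage Database.
EXAMPLE_USAGE = {
    "I": {"ATT": 16.0, "ATC": 21.0, "ATA": 7.5},
    "Y": {"TAT": 12.0, "TAC": 15.5},
    "P": {"CCT": 17.5, "CCC": 20.0, "CCA": 17.0, "CCG": 7.0},
}


def is_allowed(codon, amino_acid):
    """Apply the three exclusion rules described in the text."""
    if "CG" in codon:                        # codons containing CG are excluded
        return False
    if "TA" in codon and amino_acid != "Y":  # TA excluded, except for Tyr codons
        return False
    if codon.endswith("C"):                  # could form CG with the next codon
        return False
    return True


def best_codon(amino_acid, usage=EXAMPLE_USAGE):
    """Return the highest-usage codon that survives the exclusion rules."""
    candidates = {
        codon: freq
        for codon, freq in usage[amino_acid].items()
        if is_allowed(codon, amino_acid)
    }
    if not candidates:
        raise ValueError(f"no allowed codon for {amino_acid}")
    return max(candidates, key=candidates.get)


def back_translate(protein, usage=EXAMPLE_USAGE):
    """Back-translate a protein using one 'best' codon per amino acid."""
    return "".join(best_codon(aa, usage) for aa in protein)
```

With these placeholder values, the sketch picks ATT for Ile, TAT for Tyr and CCT for Pro even though a codon ending in C has the higher nominal usage in each case.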
DhaA Gene Sequences 

To generate a starting DhaA sequence, codon usage tables in Vector NTI
8.0 (Informax) were employed. The DhaA.v2.1 protein sequence (SEQ ID
NO:48) was back translated to create a starting gene sequence, hDhaA.v2.1-0,
and flanking regions were then added, as described above, to create hDhaA.v2.1-
0F (SEQ ID NO:49).

DhaA.v2.1:

MGSEIGTGFPFDPHYVEVLGERMHYVDVGPRDGTPVLFLHGNPTSSYLW
RNIIPHVAPSHRCIAPDLIGMGKSDKPDLDYFFDDHVRYLDAFIEALGLEE
VVLVIHDWGSALGFHWAKRNPERVKGIACMEFIRPIPTWDEWPEFARET
FQAFRTADVGRELIIDQNAFIEGALPKCVVRPLTEVEMDHYREPFLKPVD
REPLWRFPNELPIAGEPANIVALVEAYMNWLHQSPVPKLLFWGTPGVLIP
PAEAARLAESLPNCKTVDIGPGLFYLQEDNPDLIGSEIARWLPGLAG
(SEQ ID NO:48)

hDhaA.v2.1-0F:

NNNNGCTAGCCAGCTGGCGATATCGCCACCATGGGATCCGAGATTGG 

GACAGGGTTTCCTTTTGATCCTCATTATGTGGAGGTGCTGGGGGAGA 

GAATGCATTATGTGGATGTGGGGCCTAGAGATGGGACACCTGTGCTG 

TTTCTGCATGGGAATCCTACATCTTCTTATCTGTGGAGAAATATTATT

CCTCATGTGGCTCCTTCTCATAGATGTATTGCTCCTGATCTGATTGGG

ATGGGGAAGTCTGATAAGCCTGATCTGGATTATTTTTTTGATGATCAT 

GTGAGATATCTGGATGCTTTTATTGAGGCTCTGGGGCTGGAGGAGGT 

GGTGCTGGTGATTCATGATTGGGGGTCTGCTCTGGGGTTTCATTGGGC 

TAAGAGAAATCCTGAGAGAGTGAAGGGGATTGCTTGTATGGAGTTTA 

TTAGACCTATTCCTACATGGGATGAGTGGCCTGAGTTTGCTAGAGAG

ACATTTCAGGCTTTTAGAACAGCTGATGTGGGGAGAGAGCTGATTAT 

TGATCAGAATGCTTTTATTGAGGGGGCTCTGCCTAAGTGTGTGGTGA 

GACCTCTGACAGAGGTGGAGATGGATCATTATAGAGAGCCTTTTCTG 

AAGCCTGTGGATAGAGAGCCTCTGTGGAGATTTCCTAATGAGCTGCC 

TATTGCTGGGGAGCCTGCTAATATTGTGGCTCTGGTGGAGGCTTATAT

GAATTGGCTGCATCAGTCTCCTGTGCCTAAGCTGCTGTTTTGGGGGAC

ACCTGGGGTGCTGATTCCTCCTGCTGAGGCTGCTAGACTGGCTGAGTC 

TCTGCCTAATTGTAAGACAGTGGATATTGGGCCTGGGCTGTTTTATCT 

GCAGGAGGATAATCCTGATCTGATTGGGTCTGAGATTGCTAGATGGC 

TGCCCGGGCTGGCCGGCTAATAGTTAATTAAGTAAGCGGCCGCNNNN
(SEQ ID NO:49)
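A back-translated gene such as the one above can be checked by translating it and comparing the result with the intended protein. The following Python sketch assumes the standard genetic code; the `translate` helper is a hypothetical name and is not part of any tool named in this Example.

```python
# Illustrative translation check using the standard genetic code.
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an open reading frame, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))
```

For instance, the first eight codons following the Kozak sequence in the listing above, ATGGGATCCGAGATTGGGACAGGG, translate to MGSEIGTG, which matches the start of DhaA.v2.1.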
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Further Optimization 

Programs and databases used for identification and removal of sequence
motifs were from Genomatix Software GmbH (Munich, Germany,
http://www.genomatix.de): GEMS Launcher Release 3.5.1 (April 2003),
MatInspector professional Release 6.1 (January 2003), Matrix Family Library
Ver 3.1.1 (April 2003, including 318 vertebrate matrices in 128 families),
ModelInspector professional Release 4.8 (October 2002), Model Library Ver 3.1
(March 2003, 226 modules), SequenceShaper tool, and User Defined Matrices.
The sequence motifs to be removed from starting gene sequences, in order of
priority, were: restriction enzyme recognition sequences listed below;
transcription factor binding sequences including promoter modules (i.e., 2
transcription factor binding sites with defined orientation) with a default score or
greater, and vertebrate transcription factor binding sequences with a minimum
score of core=0.75/matrix=optimized; eukaryotic transcription regulatory sites
including a Kozak sequence, splice donor/acceptor sequences, polyA addition
sequences; and prokaryotic transcription regulatory sequences including E. coli
promoters and E. coli RBS if less than 20 bp upstream of a Met codon.
User-defined Matrices
Subset DhaA

Format: Matrix name (core similarity threshold/matrix similarity
threshold): U$AatII (0.75/1.00), U$BamHI (0.75/1.00), U$BglI (0.75/1.00),
U$BglII (0.75/1.00), U$BsaI (0.75/1.00), U$BsmAI (0.75/1.00), U$BsmBI
(0.75/1.00), U$BstEII (0.75/1.00), U$BstXI (0.75/1.00), U$Csp45I (0.75/1.00),
U$CspI (0.75/1.00), U$DraI (0.75/1.00), U$EC-P-10 (1.00/Optimized),
U$EC-P-35 (1.00/Optimized), U$EC-Prom (1.00/Optimized), U$EC-RBS (0.75/1.00),
U$EcoRI (0.75/1.00), U$EcoRV (0.75/1.00), U$HindIII (0.75/1.00), U$Kozak
(0.75/Optimized), U$KpnI (0.75/1.00), U$MluI (0.75/1.00), U$NaeI (0.75/1.00),
U$NcoI (0.75/1.00), U$NdeI (0.75/1.00), U$NheI (0.75/1.00), U$NotI
(0.75/1.00), U$NsiI (0.75/1.00), U$PacI (0.75/1.00), U$PflMI (0.75/1.00),
U$PmeI (0.75/1.00), U$PolyAsig (0.75/1.00), U$PstI (0.75/1.00), U$PvuII
(0.75/1.00), U$SacI (0.75/1.00), U$SacII (0.75/1.00), U$SalI (0.75/1.00), U$SfiI
(0.75/1.00), U$SgfI (0.75/1.00), U$SmaI (0.75/1.00), U$SnaBI (0.75/1.00),
U$SpeI (0.75/1.00), U$Splice-A (0.75/Optimized), U$Splice-D
(0.75/Optimized), U$XbaI (0.75/1.00), U$XcmI (0.75/1.00), U$XhoI
(0.75/1.00), and ALL vertebrates.lib.

Subset DhaA-EC

Without E. coli specific sequences: U$AatII (0.75/1.00), U$BamHI
(0.75/1.00), U$BglI (0.75/1.00), U$BglII (0.75/1.00), U$BsaI (0.75/1.00),
U$BsmAI (0.75/1.00), U$BsmBI (0.75/1.00), U$BstEII (0.75/1.00), U$BstXI
(0.75/1.00), U$Csp45I (0.75/1.00), U$CspI (0.75/1.00), U$DraI (0.75/1.00),
U$EcoRI (0.75/1.00), U$EcoRV (0.75/1.00), U$HindIII (0.75/1.00), U$Kozak
(0.75/Optimized), U$KpnI (0.75/1.00), U$MluI (0.75/1.00), U$NaeI (0.75/1.00),
U$NcoI (0.75/1.00), U$NdeI (0.75/1.00), U$NheI (0.75/1.00), U$NotI
(0.75/1.00), U$NsiI (0.75/1.00), U$PacI (0.75/1.00), U$PflMI (0.75/1.00),
U$PmeI (0.75/1.00), U$PolyAsig (0.75/1.00), U$PstI (0.75/1.00), U$PvuII
(0.75/1.00), U$SacI (0.75/1.00), U$SacII (0.75/1.00), U$SalI (0.75/1.00), U$SfiI
(0.75/1.00), U$SgfI (0.75/1.00), U$SmaI (0.75/1.00), U$SnaBI (0.75/1.00),
U$SpeI (0.75/1.00), U$Splice-A (0.75/Optimized), U$Splice-D
(0.75/Optimized), U$XbaI (0.75/1.00), U$XcmI (0.75/1.00), U$XhoI
(0.75/1.00), and ALL vertebrates.lib.

Strategy for Removal of Sequence Motifs 

The undesired sequence motifs specified above were removed from the
starting gene sequence by selecting alternate codons that allowed retention of the
specified protein and flanking sequences. Alternate codons were selected in a
way to conform to the overall codon selection strategy as much as possible.
A. General Steps

- Identify undesired sequence matches with MatInspector using matrix family
subset "DhaA" or "DhaA-EC" and with ModelInspector using default
settings.

- Identify possible replacement codons to remove undesired sequence matches
with SequenceShaper (keep ORF).

- Incorporate all changes into a new version of the synthetic gene sequence
and re-analyze with MatInspector and ModelInspector.

B. Specific Steps

- Remove undesired sequence matches using subset "DhaA-EC" and
SequenceShaper default remaining thresholds (0.70/Opt-0.20).

- For sequence matches that cannot be removed with this approach use lower
SequenceShaper remaining thresholds (e.g., 0.70/Opt-0.05).

- For sequence matches that still cannot be removed, try different
combinations of manually chosen replacement codons (especially if more
than 3 base changes might be needed). If that introduces new sequence
matches, try to remove those using the steps above (a different starting
sequence sometimes allows a different removal solution).

- Use subset "DhaA" to check whether problematic E. coli sequence motifs
were introduced, and if so try to remove them using an analogous approach
to that described above for non E. coli sequences.

Use an analogous strategy for the flanking (non-open reading frame) sequences.
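The "identify undesired sequence matches" step can be illustrated, for restriction sites only, with a simple exact-match scan in Python. This is a stand-in and not MatInspector, which scores weight matrices against similarity thresholds; the `SITES` dictionary and `find_sites` helper are hypothetical names, and only a few well-known palindromic recognition sequences are included.

```python
# Illustrative exact-match scan for restriction sites (not a matrix-based search).

# A small subset of recognition sequences; all are palindromic, so scanning
# one strand is sufficient for these particular sites.
SITES = {
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "EcoRV": "GATATC",
    "NcoI": "CCATGG",
    "NheI": "GCTAGC",
    "NotI": "GCGGCCGC",
    "PacI": "TTAATTAA",
    "PvuII": "CAGCTG",
    "SmaI": "CCCGGG",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for every site found in seq."""
    seq = seq.upper()
    hits = {}
    for name, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)  # allow overlapping matches
        if positions:
            hits[name] = positions
    return hits


# 5' flanking sequence from the design above, without the leading NNNN.
FIVE_PRIME_FLANK = "GCTAGCCAGCTGGCGATATCGCCACCATGGGATCC"
```

Calling `find_sites(FIVE_PRIME_FLANK)` reports the NheI, PvuII, EcoRV, NcoI and BamHI sites that were deliberately placed in the 5' flanking sequence; applied to an open reading frame, the same scan would flag sites that are candidates for removal by alternate codons.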
C. Identification and Removal of Putative CpG Islands
Software used: EMBOSS CpGPlot / CpGReport
(http://www.ebi.ac.uk/emboss/cpgplot/index.html) (see Gardiner-Garden et al.,
1987).

Parameters: default (modified): Window: 100; Step: 1; Obs/Exp: 0.6;
MinPC: 50; Length: 100; Reverse: no; Complement: no. After the removal of
undesired sequence motifs, the gene sequence was checked for putative CpG
islands of at least 100 bases using the software described above. If CpG islands
were identified, they were removed by selecting, at some of the CG
dinucleotide positions, alternate codons that allowed retention of the specified
protein and flanking sequences, but did not introduce new undesired sequence
motifs.
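The window criteria listed above (Window 100, Step 1, Obs/Exp 0.6, MinPC 50) can be sketched in Python as follows. This is a simplified illustration of the sliding-window test and is not a reimplementation of EMBOSS CpGPlot: it does not merge qualifying windows into islands or apply the minimum island length, and the function names are hypothetical.

```python
# Simplified sliding-window CpG test; not equivalent to EMBOSS CpGPlot.

def window_stats(window):
    """Return (%GC, observed/expected CpG ratio) for one window."""
    n = len(window)
    c_count = window.count("C")
    g_count = window.count("G")
    cpg_count = window.count("CG")          # CG cannot overlap itself
    gc_percent = 100.0 * (c_count + g_count) / n
    expected = c_count * g_count / n        # expected CpG count for this composition
    obs_exp = cpg_count / expected if expected > 0 else 0.0
    return gc_percent, obs_exp


def cpg_windows(seq, window=100, step=1, min_obs_exp=0.6, min_pc=50.0):
    """Return start positions of windows meeting both thresholds."""
    seq = seq.upper()
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc_percent, obs_exp = window_stats(seq[start:start + window])
        if gc_percent >= min_pc and obs_exp >= min_obs_exp:
            hits.append(start)
    return hits
```

A run of consecutive qualifying start positions spanning at least the minimum length would correspond to a putative island; replacing some CG-containing codons inside that run lowers the observed/expected ratio.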

D. Restriction Sites 

A unique MunI/MfeI (C'AATTG) site was introduced to allow removal of
the C-terminal 34 amino acids, including a putative myristylation site (GSEIAR)
near the C-terminus. Another unique site, a NruI site, was introduced to allow
removal of the C-terminal 80-100 amino acids.

hDhaA.v2.1-6F (FINAL, with flanking sequences)
NNNNGCTAGCCAGCTGGCgcgGATATCGCCACCATGGGATCCGAGATT
GGGACAGGGTTcCCTTTTGATCCTCAcTATGTtGAaGTGCTGGGgGAaAG 
AATGCAcTAcGTGGATGTGGGGCCTAGAGATGGGACcCCaGTGCTGTTc 
CTcCAcGGGAAcCCTACATCTagcTAcCTGTGGAGaAAtATTATaCCTCAT 
GTtGCTCCTagtCATAGgTGcATTGCTCCTGATCTGATcGGGATGGGGAA
GTCTGATAAGCCTGActtaGAcTAcTTTTTTGATGAtCATGTtcGATActTGG 
ATGCTTTcATTGAGGCTCTGGGGCTGGAGGAGGTGGTGCTGGTGATaC 
AcGAcTGGGGGTCTGCTCTGGGGTTTCAcTGGGCTAAaAGgAATCCgGA 
GAGAGTGAAGGGGATTGCTTGcATGGAgTTTATTcGACCTATTCCTACt 

TGGGAtGAaTGGCCaGAGTTTGCcAGAGAGACATTTCAaGCcTTTAGAA
CtGCcGATGTGGGcAGgGAGCTGATTATaGAcCAGAATGCTTTcATcGAG 
GGGGCTCTGCCTAAaTGTGTaGTcAGACCTCTcACtGAaGTaGAGATGGA 
cCATTATAGAGAGCCcTTTCTGAAGCCTGTGGATcGcGAGCCTCTGTGG 
AGgTTtCCaAATGAGCTGCCTATTGCTGGGGAGCCTGCTAATATTGTGG 

CTCTGGTGGAaGCcTATATGAAcTGGCTGCATCAGagTCCaGTGCCcAAG

CTaCTcTTTTGGGGGACtCCgGGaGTtCTGATTCCTCCTGCcGAGGCTGCT 

AGACTGGCTGAaTCcCTGCCcAAtTGTAAGACcGTGGAcATcGGcCCtGGg 

CTGTTTTAcCTcCAaGAGGAcAAcCCTGATCTcATcGGGTCTGAGATcGCa 

cGgTGGCTGCCCGGGCTGGCCGGCTAATAGTTAATTAAGTAgGCGGCC 
GCNNNN (SEQ ID NO:50)






A comparison of the nucleic acid sequence identity of different DhaA
genes (without flanking sequences) is shown in Table III.

Table III

                  DhaA   DhaA.v2.1   hDhaA.v.2.1-0   hDhaA.v2.1-6
DhaA              100    98          72              75
DhaA.v2.1 a              100         74              76
hDhaA.v.2.1-0 b                      100             88
hDhaA.v2.1-6                                         100

b codon optimized
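A percent identity of the kind reported in Table III can be illustrated with the Python sketch below. It assumes the two sequences are already aligned and of equal length; the source does not state which alignment tool produced the values in Table III, and the `percent_identity` name is hypothetical.

```python
# Illustrative percent identity for two pre-aligned, equal-length sequences.

def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two aligned sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)
```

Because synonymous codon changes alter only some bases of a codon, two genes encoding the same protein can still share a substantial fraction of identical positions.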



The GC content of different DhaA genes (without flanking sequences) is
provided in Table IV.

Table IV

                  GC content   CG di-nucleotides
H. sapiens        53%
DhaA              60%          85
DhaA.v2.1         60%          87
hDhaA.v.2.1-0     49%          3
hDhaA.v2.1-6      52%          21
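The two quantities in Table IV are straightforward to compute for any sequence. A minimal Python sketch, with hypothetical function names, is given here for illustration only.

```python
# Illustrative GC content and CG dinucleotide count.

def gc_content(seq):
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def cg_dinucleotides(seq):
    """Number of CG dinucleotides (CG cannot overlap itself, so count() is exact)."""
    return seq.upper().count("CG")
```

Applied to an open reading frame without its flanking sequences, these two functions give the kind of values tabulated above.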



Vertebrate transcription factor binding sequence families (core similarity:
0.75/matrix similarity: opt) and promoter modules (default parameters:
optimized threshold or 80% of maximum score) found in different DhaA genes
are shown in Table V.

Table V

Gene name         TF binding sequences   Promoter modules
                  5'F/ORF/3'F            5'F/ORF/3'F
DhaA              -/82/-                 -/5/-
DhaA.v2.1-F       3/82/12                0/5/0
hDhaA.v.2.1-0F    3/87/12                0/0/0
hDhaA.v2.1-6F     1/3/8                  0/0/0

Note: 3 bp insertion before EcoRV in hDhaA.v.2.1-0F and in hDhaA.v2.1-6F to
remove 5' binding sequence matches in 3' flanking region.

The remaining transcription factor binding sequence matches in
hDhaA.v2.1-6F included
in the 5' flanking region: Family: V$NEUR (NeuroD, Beta2, HLH
domain), best match: DNA binding site for NEUROD1 (BETA-2 / E47 dimer)
(MEDLINE 9108015); in the open reading frame: Family: V$GATA (GATA
binding factors), best match: GATA-binding factor 1 (MEDLINE 94085373),
Family: V$PCAT (Promoter CCAAT binding factors), best match: cellular and
viral CCAAT box (MEDLINE 90230299), Family: V$RXRF (RXR
heterodimer binding sites), best match: Farnesoid X-activated receptor
(RXR/FXR dimer) (MEDLINE 11792716); and in the 3' flanking region:
Family: V$HNF1 (Hepatic Nuclear Factor 1), best match: Hepatic nuclear factor
1 (MEDLINE 95194383), Family: V$BRNF (Brn POU domain factors), best
match: POU transcription factor Brn-3 (MEDLINE 9111308), Family: V$RBIT
(Regulator of B-Cell IgH transcription), best match: Bright, B cell regulator of
IgH transcription (MEDLINE 96127903), Family: V$CREB (cAMP-Responsive
Element Binding proteins), best match: E4BP4, bZIP domain, transcriptional
repressor (MEDLINE 92318924), Family: V$HOMS (Homeodomain subfamily
S8), best match: Binding site for S8 type homeodomains (MEDLINE
94051593), Family: V$NKXH (NKX/DLX - Homeodomain sites), best match:
DLX-1, -2, and -5 binding sites (MEDLINE 11798166), Family: V$TBPF (Tata-
Binding Protein Factor), best match: Avian C-type LTR TATA box (MEDLINE
6322120), and Family: V$NKXH (NKX/DLX - Homeodomain sites), best
match: Prostate-specific homeodomain protein NKX3.1 (MEDLINE 10871372).
The other sequence motifs remaining in hDhaA.v2.1-6F in the open
reading frame were an E. coli RBS (AAGG) 11 bp upstream of a Met codon,
which was not removed in order to retain the protein sequence (Lys-Gly: AA(A/G)-
GGN), and a BsmAI restriction site (GTCTC), which was not removed because
its removal would have introduced transcription factor binding site sequences.

The putative CpG islands in the coding sequence for each of the DhaA
genes were analyzed as in EMBOSS CpGPlot/CpGReport with default
parameters, and the results are shown in Table VI.

Table VI

Gene name         CpG islands >100 bp   Length bp (location in ORF)
DhaA              1                     775 bp (49..823)
DhaA.v2.1         1                     784 bp (49..832)
hDhaA.v.2.1-0     0
hDhaA.v2.1-6      0

All publications, patents and patent applications are incorporated herein
by reference. While in the foregoing specification this invention has been
described in relation to certain preferred embodiments thereof, and many details
have been set forth for purposes of illustration, it will be apparent to those skilled
in the art that the invention is susceptible to additional embodiments and that
certain of the details described herein may be varied considerably without
departing from the basic principles of the invention.

WHAT IS CLAIMED IS:

1. A compound of formula (I): R-linker-A-X, wherein R is one or more
functional groups, wherein the linker is a multiatom straight or branched 
chain including C, N, S, or O, wherein A-X is a substrate for a 
dehalogenase, and wherein X is a halogen. 

2. The compound of claim 1 which is a substrate for a Rhodococcus 
dehalogenase. 

3. The compound of claim 1 wherein X is CI or Br. 

4. The compound of claim 1 wherein A is (CH2)n and n = 4-10.

5. The compound of claim 1 wherein the linker comprises (CH2CH2O)y and
y = 2-8. 

6. The compound of claim 1 wherein the linker separates R and A by at 
least 12 atoms. 

7. The compound of claim 1 wherein the linker comprises 3 to 30 atoms.

8. The compound of claim 1 wherein the linker has 11 to 30 atoms.

9. The compound of claim 1 which is
[chemical structure not reproduced]

10. The compound of claim 1 wherein at least one functional group is an
amino acid, protein, glycoprotein, nucleic acid molecule, drug, lipid,
biotin, or solid support.

11. The compound of claim 1 wherein at least one functional group is an
optically detectable molecule.

12. The compound of claim 11 wherein at least one functional group is a
fluorophore.

13. The compound of claim 1 wherein R is one of
[chemical structures not reproduced]
and wherein R1 is C1-C8.

14. The compound of claim 1 which comprises two functional groups. 

15. The compound of claim 1 wherein at least one functional group binds
Ca2+, binds K+, binds Na+, is pH sensitive, is a radionuclide, is electron
opaque, is a chromophore, is a MRI contrast agent, fluoresces in the
presence of NO or is sensitive to a reactive oxygen.

16. A mutant dehalogenase comprising at least one amino acid substitution 
relative to a corresponding wild-type dehalogenase, wherein the mutant 
dehalogenase forms a bond with a dehalogenase substrate which 
comprises one or more functional groups, which bond is more stable than 
the bond formed between the corresponding wild-type dehalogenase and 
the substrate, wherein the at least one amino acid substitution in the 
mutant dehalogenase is a substitution at an amino acid residue in the 
corresponding wild-type dehalogenase that is associated with activating a 
water molecule which cleaves the bond formed between the 
corresponding wild-type dehalogenase and the substrate or at an amino 
acid residue in the corresponding wild-type dehalogenase that forms an 
ester intermediate with the substrate, wherein the substituted amino acid 
at a residue associated with activating a water molecule is not glutamine 
or asparagine. 

17. The mutant dehalogenase of claim 16 wherein the substitution is at a
residue in the wild-type dehalogenase that activates the water molecule.

18. The mutant dehalogenase of claim 17 wherein the residue in the wild- 
type dehalogenase that activates the water molecule is histidine. 

19. The mutant dehalogenase of claim 16 wherein the substitution is at a
residue in the wild-type dehalogenase which forms an ester intermediate
with the substrate.

20. The mutant dehalogenase of claim 19 wherein the residue in the wild-
type dehalogenase which forms the ester intermediate is aspartate.

21. The mutant dehalogenase of claim 16 wherein the at least one
substitution is at a position corresponding to amino acid residue 272 of a 
Rhodococcus rhodochrous dehalogenase. 

22. The mutant dehalogenase of claim 21 wherein the substituted amino acid 
at the position corresponding to amino acid residue 272 is phenylalanine, 
glycine or alanine. 

23. The mutant dehalogenase of claim 16 wherein the at least one
substitution is at a position corresponding to amino acid residue 106 of a
Rhodococcus rhodochrous dehalogenase.

24. The mutant dehalogenase of claim 23 wherein the substituted amino acid 
at the position corresponding to amino acid residue 106 is cysteine or 
glutamate. 

25. The mutant dehalogenase of claim 16 further comprising a protein of
interest, thereby yielding a fusion protein. 

26. The mutant dehalogenase of claim 25 wherein the protein of interest is a 
selectable marker protein, membrane protein, cytosolic protein, nuclear 
protein, structural protein, an enzyme, an enzyme substrate, a receptor 
protein, a transporter protein, a transcription factor, a channel protein, a 
phospho-protein, a kinase, a signaling protein, a metabolic protein, a 
mitochondrial protein, a receptor associated protein, a nucleic acid 
binding protein, an extracellular matrix protein, a secreted protein, a 
receptor ligand, a serum protein, an immunogenic protein, a fluorescent 
protein, or a protein with reactive cysteines. 

27. The mutant dehalogenase of claim 16 which has at least 85% amino acid
sequence identity to the corresponding wild-type dehalogenase. 

28. A mutant serine beta-lactamase comprising at least two amino acid 
substitutions relative to a corresponding wild-type serine beta-lactamase, 
wherein the mutant serine beta-lactamase forms a bond with a serine 
beta-lactamase substrate which comprises one or more functional groups, 
which bond is more stable than the bond formed between the 
corresponding wild-type serine beta-lactamase and the substrate and the 
bond formed between a mutant beta-lactamase having one of the 
substitutions and the substrate, wherein the at least two amino acid 
substitutions in the mutant serine beta-lactamase are substitutions at 
amino acid residues in the corresponding wild-type serine beta-lactamase 
that are associated with activating a water molecule which cleaves the 
bond formed between the corresponding wild-type serine beta-lactamase 
and the substrate, and wherein the at least two amino acid substitutions in 
the mutant serine beta-lactamase are associated with the stability of the 
bond formed between the mutant serine beta-lactamase and the substrate. 

29. The mutant beta-lactamase of claim 28 wherein the substitutions are at 
positions corresponding to amino acid residue 166 and amino acid 
residue 170 of a Staphylococcus aureus serine beta-lactamase. 

30. The mutant beta-lactamase of claim 29 wherein the substituted amino
acid at the position corresponding to amino acid residue 166 in the 
mutant beta-lactamase is an aspartic acid. 

31. The mutant beta-lactamase of claim 28 wherein the substituted amino
acid at the position corresponding to amino acid residue 170 in the 
mutant beta-lactamase is a glutamine. 

32. The mutant beta-lactamase of claim 31 wherein the substituted amino
acid at the position corresponding to amino acid residue 166 in the
mutant beta-lactamase is an aspartic acid and the substituted amino acid
at the position corresponding to amino acid residue 170 in the mutant
beta-lactamase is a glutamine.

33. The mutant beta-lactamase of claim 28 further comprising a protein of 
interest, thereby yielding a fusion protein. 

34. The mutant beta-lactamase of claim 33 wherein the protein of interest is a 
selectable marker protein, membrane protein, cytosolic protein, nuclear 
protein, structural protein, an enzyme, an enzyme substrate, a receptor 
protein, a transporter protein, a transcription factor, a channel protein, a 
phospho-protein, a kinase, a signaling protein, a metabolic protein, a 
mitochondrial protein, a receptor associated protein, a nucleic acid 
binding protein, an extracellular matrix protein, a secreted protein, a 
receptor ligand, a serum protein, an immunogenic protein, a fluorescent 
protein, or a protein with reactive cysteines. 

35. A method to detect or determine the presence or amount of a mutant 
hydrolase, comprising: 

a) contacting a mutant hydrolase with a hydrolase substrate which
comprises one or more functional groups, wherein the mutant hydrolase
comprises at least one amino acid substitution relative to a corresponding 
wild-type hydrolase, wherein the at least one amino acid substitution 
results in the mutant hydrolase forming a bond with the substrate which 
is more stable than the bond formed between the corresponding wild-type 
hydrolase and the substrate, wherein the at least one amino acid 
substitution in the mutant hydrolase is a substitution at an amino acid 
residue in the corresponding wild-type hydrolase that is associated with 
activating a water molecule which cleaves the bond formed between the 
corresponding wild-type hydrolase and the substrate or at an amino acid 
residue in the corresponding wild-type hydrolase that forms an ester 
intermediate with the substrate; and 

b) detecting or determining the presence or amount of the functional group,
thereby detecting or determining the presence or amount of the mutant 
hydrolase. 

36. The method of claim 35 wherein the substitution is at a residue in the 
wild-type hydrolase that activates the water molecule. 

37. The method of claim 36 wherein the residue in the wild-type hydrolase 
that activates the water molecule is histidine. 

38. The method of claim 35 wherein the substitution is at a residue in the 
wild-type hydrolase that forms an ester intermediate with the substrate. 

39. The method of claim 38 wherein the residue in the wild-type hydrolase 
that forms an ester intermediate with the substrate is aspartate. 

40. A method to isolate a molecule, cell or subcellular organelle of interest in 
a sample, comprising: 

a) contacting a sample with a fusion protein comprising a mutant 
hydrolase and a hydrolase substrate which comprises one or more 
functional groups, wherein the mutant hydrolase comprises at least one 
amino acid substitution relative to a corresponding wild-type hydrolase, 
wherein the at least one amino acid substitution results in the mutant 
hydrolase forming a bond with the substrate which is more stable than 
the bond formed between the corresponding wild-type hydrolase and the 
substrate, wherein the at least one amino acid substitution in the mutant 
hydrolase is a substitution at an amino acid residue in the corresponding 
wild-type hydrolase that is associated with activating a water molecule 
which cleaves the bond formed between the corresponding wild-type 
hydrolase and the substrate or at an amino acid residue in the 
corresponding wild-type hydrolase that forms an ester intermediate with 
the substrate, wherein the fusion protein comprises a protein which binds 
a molecule, cell or subcellular organelle of interest; and 

b) isolating the molecule, cell or subcellular organelle of interest.

41. The method of claim 40 wherein the substitution is at a residue in the
wild-type hydrolase that activates the water molecule. 

42. The method of claim 41 wherein the residue in the wild-type hydrolase 
that activates the water molecule is histidine. 

43. The method of claim 40 wherein the substitution is at a residue in the 
wild-type hydrolase that forms an ester intermediate with the substrate. 

44. The method of claim 43 wherein the residue in the wild-type hydrolase 
that forms an ester intermediate with the substrate is aspartate. 

45. The method of claim 40 wherein at least one functional group is a solid 
support or a molecule which binds to a solid support. 

46. The method of claim 40 wherein the molecule of interest is a protein. 

47. A method to label a cell, comprising: 

a) contacting a cell comprising a mutant hydrolase with a hydrolase 
substrate which comprises one or more functional groups, wherein the 
mutant hydrolase comprises at least one amino acid substitution relative 
to a corresponding wild-type hydrolase, wherein the at least one amino 
acid substitution results in the mutant hydrolase forming a bond with the 
substrate which is more stable than the bond formed between the 
corresponding wild-type hydrolase and the substrate, wherein the at least 
one amino acid substitution in the mutant hydrolase is a substitution at an 
amino acid residue in the corresponding wild-type hydrolase that is 
associated with activating a water molecule which cleaves a bond formed 
between the corresponding wild-type hydrolase and the substrate or at an 
amino acid residue in the corresponding wild-type hydrolase that forms 
an ester intermediate with the substrate; and 

b) detecting or determining the presence or amount of the functional
group. 

48. The method of claim 47 wherein the substitution is at a residue in the 
wild-type hydrolase that activates the water molecule. 

49. The method of claim 48 wherein the residue in the wild-type hydrolase 
that activates the water molecule is histidine. 

50. The method of claim 47 wherein the substitution is at a residue in the 
wild-type hydrolase that forms an ester intermediate with the substrate. 

51. The method of claim 50 wherein the residue in the wild-type hydrolase
that forms an ester intermediate with the substrate is aspartate. 

52. The method of any one of claims 35, 40 or 47 wherein the wild-type 
hydrolase is a dehalogenase. 

53. The method of claim 52 wherein the substrate comprises R-linker-A-X,
wherein R is one or more functional groups, wherein the linker is a
multiatom straight or branched chain including C, N, S, or O, wherein
A-X is a substrate for a dehalogenase, and wherein X is a halogen.

54. The method of claim 53 wherein A is (CH₂)n and n = 4-10.

55. The method of claim 53 wherein the linker comprises (CH₂CH₂)y and y =
2-8.

56. The method of claim 53 wherein the linker separates R and A by at least 
12 atoms. 

57. The method of any one of claims 35, 40 or 47 wherein the wild-type
hydrolase is a serine beta-lactamase. 

58. The method of any one of claims 35, 40 or 47 wherein the mutant
hydrolase is present in a cell or on the surface of a cell. 

59. The method of any one of claims 35, 40 or 47 wherein at least one 
functional group is an amino acid, protein, glycoprotein, nucleic acid 
molecule, drug, lipid, biotin, or solid support.

60. The method of any one of claims 35, 40 or 47 wherein at least one 
functional group is an optically detectable molecule. 

61. The method of claim 60 wherein at least one functional group is a
fluorophore. 

62. The method of any one of claims 35, 40 or 47 wherein the substrate 
comprises two functional groups. 

63. The method of any one of claims 35, 40 or 47 wherein at least one 
functional group binds Ca²⁺, binds K⁺, binds Na⁺, is pH sensitive, is
electron opaque, is a chromophore, is a MRI contrast agent, is a 
radionuclide, fluoresces in the presence of NO or is sensitive to a reactive 
oxygen. 

64. The method of any one of claims 35, 40 or 47 wherein the presence of at 
least one functional group in a cell is correlated to the subcellular 
location of the mutant hydrolase. 

65. The method of any one of claims 35, 40 or 47 wherein the mutant 
hydrolase further comprises a protein of interest, thereby yielding a 
fusion protein. 

66. The method of claim 65 wherein the protein of interest is a selectable 
marker protein, membrane protein, cytosolic protein, nuclear protein,
structural protein, an enzyme, an enzyme substrate, a receptor protein, a 
transporter protein, a transcription factor, a channel protein, a phospho- 
protein, a kinase, a signaling protein, a metabolic protein, a 
mitochondrial protein, a receptor associated protein, a nucleic acid 
binding protein, an extracellular matrix protein, a secreted protein, a 
receptor ligand, a serum protein, an immunogenic protein, a fluorescent 
protein, or a protein with reactive cysteine. 

67. The method of claim 47 wherein the mutant hydrolase further comprises 
a selectable marker protein. 

68. The method of claim 67 wherein at least one functional group in the 
substrate is a fluorophore. 

69. The method of claim 68 wherein the mutant hydrolase forms an ester 
bond with the substrate. 

70. The method of claim 68 wherein the mutant hydrolase forms a thioester 
bond with the substrate. 

71. The method of claim 47 further comprising contacting the cell with a
fixative prior to or after contacting the cell with the substrate. 

72. The method of claim 47 further comprising contacting the cell with a 
fixative concurrently with contacting the cell with the substrate. 

73. The method of claim 71 or 72 wherein the cell is fixed with methanol, 
acetone and/or paraformaldehyde. 

74. The method of claim 67 further comprising contacting the cell with a 
fixative prior to or after contacting the cell with the substrate. 

75. The method of claim 67 further comprising contacting the cell with a
fixative concurrently with contacting the cell with the substrate.

76. The method of claim 74 or 75 wherein the cell is fixed with methanol, 
acetone and/or paraformaldehyde. 



77. The method of claim 52 wherein the mutant dehalogenase is encoded by
a nucleic acid sequence which is optimized for expression in a selected 
host cell. 



78. A compound of any of formula II-XXVIII.

79. An isolated polynucleotide comprising a nucleic acid sequence encoding 
a dehalogenase, wherein the nucleic acid sequence for the dehalogenase 
is optimized for expression in a selected host cell. 

80. The isolated polynucleotide of claim 79 wherein the dehalogenase is a 
mutant dehalogenase comprising at least one amino acid substitution 
relative to a corresponding wild-type dehalogenase, wherein the mutant 
dehalogenase forms a bond with a dehalogenase substrate, which bond is 
more stable than the bond formed between the corresponding wild-type 
dehalogenase and the substrate, wherein the at least one amino acid 
substitution in the mutant dehalogenase is a substitution at an amino acid 
residue in the corresponding wild-type dehalogenase that is associated 
with activating a water molecule which cleaves the bond formed between 
the corresponding wild-type dehalogenase and the substrate or at an 
amino acid residue in the corresponding wild-type dehalogenase that 
forms an ester intermediate with the substrate. 



81. The isolated polynucleotide of claim 79 wherein the nucleic acid
sequence hybridizes to SEQ ID NO:50 or the complement thereof under
medium stringency conditions. 

82. The isolated polynucleotide of claim 81 wherein the nucleic acid
sequence hybridizes to SEQ ID NO:50 or the complement thereof under 
high stringency conditions. 
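
By way of illustration only, and not as part of the claims: the "complement thereof" recited in claims 81 and 82 can be read as the reverse complement of the recited strand. The sketch below is a minimal Python example under that assumption. The helper name `reverse_complement` is hypothetical, the input is assumed to contain only A, C, G, T or N, and the short test sequence is the Kozak sequence listed as SEQ ID NO:45. Hybridization under medium or high stringency is an experimental condition and is not modeled here.

```python
# Illustrative sketch only; the function name is hypothetical.
# Maps each base to its Watson-Crick partner (n/N stays n/N).
_COMPLEMENT = str.maketrans("acgtnACGTN", "tgcanTGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string.

    Whitespace (e.g. the 10-base grouping used in sequence listings)
    is removed before processing.
    """
    cleaned = "".join(seq.split())
    return cleaned.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # SEQ ID NO:45 as listed: gccaccatgg
    print(reverse_complement("gccaccatgg"))
```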

83. The isolated polynucleotide of claim 79 wherein the nucleic acid
sequence encodes a mutant dehalogenase with at least 85% amino acid 
sequence identity to the corresponding wild-type dehalogenase. 

84. The isolated polynucleotide of claim 83 wherein the corresponding 
dehalogenase is encoded by SEQ ID NO:51.

85. The isolated polynucleotide of claim 79 wherein the nucleic acid 
sequence has less than 90% nucleic acid sequence identity to SEQ ID 
NO:51. 

86. The isolated polynucleotide of claim 79 wherein the nucleic acid 
sequence has at least 90% nucleic acid sequence identity to SEQ ID 
NO:50. 
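
By way of illustration only, and not as part of the claims: claims 83, 85 and 86 recite percent sequence identity thresholds. The sketch below shows one simple way such a percentage might be computed, assuming the two sequences have already been aligned to equal length with "-" marking gaps. The function name `percent_identity` is hypothetical, and the convention used (matches divided by aligned columns, with gap columns counted as mismatches) is an assumption; it is not asserted to be the definition used in the specification.

```python
# Illustrative sketch only; name and identity convention are assumptions.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a gap opposite
    a residue counts as a mismatch. Comparison is case-insensitive.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, minimum: float) -> bool:
    """True if the identity is at least `minimum` percent (e.g. 85 or 90)."""
    return percent_identity(aligned_a, aligned_b) >= minimum
```

Under this convention, a nucleic acid sequence would fall within "at least 90%" when `meets_threshold(a, b, 90.0)` is true for the chosen alignment; a different alignment method or gap treatment may give a different figure.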

87. An isolated cell comprising a polynucleotide encoding a fusion protein, 
wherein the fusion protein comprises a selectable marker protein and a 
protein which is capable of irreversibly or stably binding a substrate or a 
portion thereof which includes a functional group. 

88. The cell of claim 87 wherein the functional group is a fluorophore. 

89. The isolated cell of claim 87 wherein the protein which is capable of 
stably binding a substrate which includes a functional group is a mutant 
hydrolase comprising at least one amino acid substitution relative to a 
corresponding wild-type hydrolase, wherein the mutant hydrolase forms 
a bond with a hydrolase substrate which comprises a fluorophore, which 
bond is more stable than the bond formed between the corresponding 
wild-type hydrolase and the substrate, wherein the at least one amino 
acid substitution in the mutant hydrolase is a substitution at an amino
acid residue in the corresponding wild-type hydrolase that is associated 
with activating a water molecule which cleaves the bond formed between 
the corresponding wild-type hydrolase and the substrate or at an amino 
acid residue in the corresponding wild-type hydrolase that forms an ester 
intermediate with the substrate. 

90. The isolated cell of claim 87 wherein the protein irreversibly binds at 
least a portion of the substrate which includes the functional group. 

91. A method to label a cell, comprising: 

contacting cells comprising a fusion protein comprising a selectable 
marker protein and a second protein which is capable of irreversibly or 
stably binding a substrate or a portion thereof which includes a functional 
group, with the substrate. 

92. The method of claim 91 wherein cells which express the selectable 
marker protein are selected prior to contacting the cells with the 
substrate. 

93. The method of claim 91 wherein cells which express the selectable 
marker protein are selected after contacting the cells with the substrate. 

94. The method of claim 91 wherein the functional group is a fluorophore.

95. The method of claim 91 further comprising contacting the cells with a 
fixative prior to or after contacting the cells with the substrate. 

96. The method of claim 91 further comprising contacting the cells with a 
fixative concurrently with contacting the cells with the substrate. 

97. A mutant hydrolase comprising at least one amino acid substitution 
relative to a corresponding wild-type hydrolase, wherein the mutant 
hydrolase forms a bond with a hydrolase substrate which comprises one
or more functional groups, which bond is more stable than the bond 
formed between the corresponding wild-type hydrolase and the substrate, 
wherein the at least one amino acid substitution in the mutant hydrolase 
is a substitution at an amino acid residue in the corresponding wild-type 
hydrolase that is associated with activating a water molecule which 
cleaves the bond formed between the corresponding wild-type hydrolase 
and the substrate or at an amino acid residue in the corresponding wild- 
type hydrolase that forms an ester intermediate with the substrate, 
wherein the substituted amino acid at a residue associated with activating 
a water molecule is not glutamine or asparagine. 

98. The mutant hydrolase of claim 97 wherein the substitution is at a residue 
in the wild-type hydrolase that activates the water molecule. 

99. The mutant hydrolase of claim 98 wherein the residue in the wild-type 
hydrolase that activates the water molecule is histidine. 

100. The mutant hydrolase of claim 97 wherein the substitution is at a residue
in the wild-type hydrolase which forms an ester intermediate with the 
substrate. 

101. The mutant hydrolase of claim 100 wherein the residue in the wild-type
hydrolase which forms the ester intermediate is aspartate. 

102. The mutant hydrolase of claim 97 wherein the at least one substitution is 
at a position corresponding to amino acid residue 272 of a Rhodococcus 
rhodochrous dehalogenase. 

103. The mutant hydrolase of claim 102 wherein the substituted amino acid at
the position corresponding to amino acid residue 272 is phenylalanine or 
glycine. 

104. The mutant hydrolase of claim 97 wherein the at least one substitution is
at a position corresponding to amino acid residue 106 of a Rhodococcus 
rhodochrous dehalogenase. 



105. The mutant hydrolase of claim 104 wherein the substituted amino acid at
the position corresponding to amino acid residue 106 is cysteine or 
glutamate. 

106. The mutant hydrolase of claim 97 wherein the substituted amino acid at a 
residue associated with activating a water molecule is not methionine, 
aspartate, or alanine. 

107. A method for preparing a compound of formula R-Linker-A-X
comprising coupling a compound of formula R-Y with a compound of
formula Z-Linker-A-X, wherein Y and Z are groups that can react to link
R- to -Linker-A-X.



108. The method of claim 107 wherein R-Y is an activated ester of a
compound of formula R and wherein Z is an amine suitable to react with the
activated ester to form an amide bond.

109. A method for preparing a compound of formula R-Linker-A-X wherein
the Linker comprises an amide bond comprising coupling a corresponding
activated ester with a corresponding amine to provide the compound of formula
R-Linker-A-X.

SEQUENCE LISTING

<110> Wood, Keith V.
Los, Georgyi V.
Bulleit, Robert F.
Klaubert, Dieter
McDougall, Mark
Zimprich, Chad
Promega Corporation

<120> Covalent Tethering of Functional Groups to Proteins

<130> 341.020WO1 

<150> US 60/444,094 
<151> 2003-01-31 

<150> US 60/474,659 
<151> 2003-05-30

<160> 64 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 31 
<212> DNA 

<213> Rhodococcus rhodochrous 

<400> 1 

gcttcacttg tcgtcatcgt ccttgtagtc a 31 

<210> 2 
<211> 31
<212> DNA 

<213> Rhodococcus rhodochrous 



<400> 2 

gcttcacttg tcgtcatcgt ccttgtagtc a

<210> 3 
<211> 27 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> A synthetic primer 
<400> 3 

ccgggattgt tctacctcca ggaagac

<210> 4 
<211> 27 
<212> DNA 
<213> Artificial Sequence

<220> 

<223> A synthetic primer

<400> 4

ccgggattgg cctacctcca ggaagac 

<210> 5 
<211> 27
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> A synthetic primer
<400> 5

ccgggattgc agtacctcca ggaagac 

<210> 6
<211> 27 
<212> DNA 

<213> Artificial Sequence 

<220>

<223> A synthetic primer 
<400> 6 

ccgggattgg gctacctcca ggaagac 

<210> 7 
<211> 37
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> A synthetic primer
<400> 7 

acgcgtcgac gccgccatgt cagaaatcgg tacaggc 

<210> 8
<211> 55 
<212> DNA 

<213> Artificial Sequence 

<220>

<223> A synthetic primer 

<400> 8 

ataagaatgc ggccgctcaa gcgcttcaac cggtgagtgc ggggagccag cgcgc 

<210> 9 
<211> 35 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> A synthetic oligonucleotide 
<400> 9 

ccggtgacta caaggacgat gacgacaagt gaagc

<210> 10 
<211> 31 
<212> DNA 
<213> Artificial Sequence

<220> 

<223> A synthetic oligonucleotide 
<400> 10

gcttcacttg tcgtcatcgt ccttgtagtc a

<210> 11 
<211> 31 
<212> DNA 
<213> Artificial Sequence

<220> 

<223> A synthetic primer 

<400> 11

gcttcacttg tcgtcatcgt ccttgtagtc a 

<210> 12 
<211> 31 
<212> DNA

<213> Artificial Sequence 

<220> 

<223> A synthetic primer 

<400> 12 

gcttcacttg tcgtcatcgt ccttgtagtc a 

<210> 13 
<211> 43
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> A synthetic primer
<400> 13 

cttgggtttg gaagaggtcg tcctggtcat ccactgctgg ggc 
<210> 14 
<211> 42 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> A synthetic primer 
<400> 14 

tgagccccag cagtggatga ccaggacgac ctcttccaaa cc

<210> 15
<211> 29
<212> DNA
<213> Artificial Sequence

<220> 

<223> A synthetic primer 

<400> 15

ggaatgggcc ctctagagcg acgatgtca 

<210> 16 
<211> 26 
<212> DNA

<213> Artificial Sequence 

<220> 

<223> A synthetic primer 

<400> 16 

cagtcagtca cgatggatcc gctcaa 
<210> 17 

<400> 17 
000 

<210> 18 

<400> 18 
000 
<210> 19 
<211> 5 
<212> PRT 

<213> Artificial Sequence 

<220> 

<223> A synthetic affinity molecule 

<400> 19 
His His His His His
1               5

<210> 20 
<211> 6 
<212> PRT

<213> Artificial Sequence 

<220> 

<223> A synthetic affinity molecule 

<400> 20 

His His His His His His 
1 5 

<210> 21
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220>

<223> A synthetic affinity molecule 
<400> 21 

Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu
1               5                   10

<210> 22 
<211> 8 
<212> PRT 
<213> Artificial Sequence

<220> 

<223> A synthetic affinity molecule 
<400> 22 

Asp Tyr Lys Asp Asp Asp Asp Lys
1 5 

<210> 23 
<211> 8 
<212> PRT

<213> Artificial Sequence 

<220> 

<223> A synthetic affinity molecule 

<400> 23 

Trp Ser His Pro Gln Phe Glu Lys
1 5 

<210> 24
<211> 9 
<212> PRT 

<213> Artificial Sequence 
<220>

<223> A synthetic affinity molecule 
<400> 24 

Tyr Pro Tyr Asp Val Pro Asp Tyr Ala 
1               5

<210> 25 

<400> 25 
000

<210> 26 
<211> 45 
<212> DNA 
<213> Artificial Sequence

<220> 

<223> A synthetic oligonucleotide 
<400> 26 

atcgaaggtc gtgggatccc caggaattcc cgggtcgacg ccgcc

<210> 27 
<211> 15 
<212> PRT 
<213> Artificial Sequence

<220> 

<223> A synthetic peptide 
<400> 27

Ile Glu Gly Arg Gly Ile Pro Arg Asn Ser Arg Val Asp Ala Ala
1               5                   10                  15

<210> 28 
<211> 51
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> A synthetic oligonucleotide
<400> 28 

tccggatcaa gcttgggcga cgaggtggac ggcgggccct ctagagccac c 

<210> 29
<211> 17 
<212> PRT 

<213> Artificial Sequence 

<220>

<223> A synthetic peptide 



<400> 29

Ser Gly Ser Ser Leu Gly Asp Glu Val Asp Gly Gly Pro Ser Arg Ala
1               5                   10                  15
Thr

<210> 30 
<211> 45 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> A synthetic oligonucleotide 
<400> 30 

accggttccg gatcaagctt gcggtaccgc gggccctcta gagcc

<210> 31 
<211> 15 
<212> PRT 
<213> Artificial Sequence

<220> 

<223> A synthetic peptide 
<400> 31

Thr Gly Ser Gly Ser Ser Leu Arg Tyr Arg Gly Pro Ser Arg Ala 
1               5                   10                  15

<210> 32 
<211> 51
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> A synthetic oligonucleotide
<400> 32 

tccggatcaa gcttgcggta ccgcgggccc tctagagccg tcgacgccgc c 

<210> 33
<211> 17 
<212> PRT 

<213> Artificial Sequence 

<220>

<223> A synthetic peptide 
<400> 33

Ser Gly Ser Ser Leu Arg Tyr Arg Gly Pro Ser Arg Ala Val Asp Ala
1               5                   10                  15
Ala

<210> 34 
<211> 43 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> A synthetic oligonucleotide 
<400> 34 

cttgggtttg gaagaggtcg tcctggtcat ccaccagtgg ggc

<210> 35 
<211> 42 
<212> DNA 
<213> Artificial Sequence

<220> 

<223> A synthetic oligonucleotide 
<400> 35

tgagccccac tggtggatga ccaggacgac ctcttccaaa cc 

<210> 36 
<211> 1053 
<212> DNA

<213> Staphylococcus aureus 



<400> 36

agcttactat gccattatta ataacttagc catttcaaca ccttctttca aatatttata 60
ataaactatt gacaccgata ttacaattgt aatattattg atttataaaa attacaactg 120
taatatcgga gggtttattt tgaaaaagtt aatattttta attgtaattg ctttagtttt 180
aagtgcatgt aattcaaaca gttcacatgc caaagagtta aatgatttag aaaaaaaata 240
taatgctcat attggtgttt atgctttaga tactaaaagt ggtaaggaag taaaatttaa 300
ttcagataag agatttgcct atgcttcaac ttcaaaagcg ataaatagtg ctattttgtt 360
agaacaagta ccttataata agttaaataa aaaagtacat attaacaaag atgatatagt 420
tgcttattct cctattttag aaaaatatgt aggaaaagat atcactttaa aagcacttat 480
tgaggcttca atgacatata gtgataatac agcaaacaat aaaattataa aagaaatcgg 540
tggaatcaaa aaagttaaac aacgtctaaa agaactagga gataaagtaa caaatccagt 600
tagatatgag atagaattaa attactattc accaaagagc aaaaaagata cttcaacacc 660
tgctgccttc ggtaagaccc ttaataaact tatcgccaat ggaaaattaa gcaaagaaaa 720
caaaaaattc ttacttgatt taatgttaaa taataaaagc ggagatactt taattaaaga 780
cggtgttcca aaagactata aggttgctga taaaagtggt caagcaataa catatgcttc 840
tagaaatgat gttgcttttg tttatcctaa gggccaatct gaacctattg ttttagtcat 900
ttttacgaat aaagacaata aaagtgataa gccaaatgat aagttgataa gtgaaaccgc 960
caagagtgta atgaaggaat tttaatattc taaatgcata ataaatactg ataacatctt 1020
atattttgta ttatattttg tattatcgtt gac 1053

<210> 37 
<211> 81 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> A synthetic oligonucleotide 



<400> 37 

ccggtgatcc aaaaaagaag agaaaggtag atccaaaaaa gaagagaaag gtagatccaa 60
aaaagaagag aaaggtatga g 81 



<210> 38 
<211> 81 
<212> DNA

<213> Artificial Sequence 



<220> 

<223> A synthetic oligonucleotide 

<400> 38 

gatcctcata cctttctctt cttttttgga tctacctttc tcttcttttt tggatctacc 60 
tttctcttct tttttggatc a 81 



<210> 39
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> A synthetic oligonucleotide

<400> 39 

attatgctga gtgatatccc 

<210> 40 
<211> 25
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> A synthetic oligonucleotide
<400> 40 

ctcggtacca agctccttgt agtca 

<210> 41
<211> 40 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> A synthetic oligonucleotide 
<400> 41 

caaaggtatt gcatgtatgc agttcatccg gcctatcccg 

<210> 42 
<211> 44 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> A synthetic oligonucleotide 
<400> 42 

gtcaaaggta ttgcatgtat gctgttcatc cggcctatcc cgac

<210> 43 
<211> 36 
<212> DNA 
<213> Artificial Sequence

<220>

<223> A synthetic oligonucleotide 
<400> 43 

aggtattgca tgtatggcgt tcatccggcc tatccc

<210> 44 
<211> 23 
<212> DNA 
<213> Artificial Sequence

<220> 

<223> A synthetic primer 

<400> 44

gggctggcaa gccacgtttg gtg 

<210> 45 
<211> 10 
<212> DNA

<213> Artificial Sequence 

<220> 

<223> A synthetic improved Kozak sequence 

<400> 45 
gccaccatgg 

<210> 46 
<211> 36
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> A synthetic oligonucleotide
<220> 

<221> misc_feature 
<222> 1-36 
<223> n = A, T, G, or C

<400> 46 

nnnngctagc cagctggcga tatcgccacc atggga 

<210> 47 
<211> 30
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> A synthetic oligonucleotide
<220> 

<221> misc_feature 
<222> 1-34 
<223> n = A, T, G, or C

<400> 47 

taatagttaa ttaagtaagc ggccgcnnnn 

<210> 48
<211> 296 
<212> PRT 

<213> Artificial Sequence 

<220>

<223> A synthetic peptide 



<400> 48 

Met Gly Ser Glu Ile Gly Thr Gly Phe Pro Phe Asp Pro His Tyr Val
1               5                   10                  15
Glu Val Leu Gly Glu Arg Met His Tyr Val Asp Val Gly Pro Arg Asp
            20                  25                  30
Gly Thr Pro Val Leu Phe Leu His Gly Asn Pro Thr Ser Ser Tyr Leu
        35                  40                  45
Trp Arg Asn Ile Ile Pro His Val Ala Pro Ser His Arg Cys Ile Ala
    50                  55                  60
Pro Asp Leu Ile Gly Met Gly Lys Ser Asp Lys Pro Asp Leu Asp Tyr
65                  70                  75                  80
Phe Phe Asp Asp His Val Arg Tyr Leu Asp Ala Phe Ile Glu Ala Leu
                85                  90                  95
Gly Leu Glu Glu Val Val Leu Val Ile His Asp Trp Gly Ser Ala Leu
            100                 105                 110
Gly Phe His Trp Ala Lys Arg Asn Pro Glu Arg Val Lys Gly Ile Ala
        115                 120                 125
Cys Met Glu Phe Ile Arg Pro Ile Pro Thr Trp Asp Glu Trp Pro Glu
    130                 135                 140
Phe Ala Arg Glu Thr Phe Gln Ala Phe Arg Thr Ala Asp Val Gly Arg
145                 150                 155                 160
Glu Leu Ile Ile Asp Gln Asn Ala Phe Ile Glu Gly Ala Leu Pro Lys
                165                 170                 175
Cys Val Val Arg Pro Leu Thr Glu Val Glu Met Asp His Tyr Arg Glu
            180                 185                 190
Pro Phe Leu Lys Pro Val Asp Arg Glu Pro Leu Trp Arg Phe Pro Asn
        195                 200                 205
Glu Leu Pro Ile Ala Gly Glu Pro Ala Asn Ile Val Ala Leu Val Glu
    210                 215                 220
Ala Tyr Met Asn Trp Leu His Gln Ser Pro Val Pro Lys Leu Leu Phe
225                 230                 235                 240
Trp Gly Thr Pro Gly Val Leu Ile Pro Pro Ala Glu Ala Ala Arg Leu
                245                 250                 255
Ala Glu Ser Leu Pro Asn Cys Lys Thr Val Asp Ile Gly Pro Gly Leu
            260                 265                 270
Phe Tyr Leu Gln Glu Asp Asn Pro Asp Leu Ile Gly Ser Glu Ile Ala
        275                 280                 285
Arg Trp Leu Pro Gly Leu Ala Gly
    290                 295


<210> 49 
<211> 948 
<212> DNA 

<213> Artificial Sequence 


<220> 

<223> A synthetic oligonucleotide 
<220> 

<221> misc_feature
<222> 1-948 

<223> n = A, C, T, or G 
<400> 49 

nnnngctagc cagctggcga tatcgccacc atgggatccg agattgggac agggtttcct 60
tttgatcctc attatgtgga ggtgctgggg gagagaatgc attatgtgga tgtggggcct 120
agagatggga cacctgtgct gtttctgcat gggaatccta catcttctta tctgtggaga 180
aatattattc ctcatgtggc tccttctcat agatgtattg ctcctgatct gattgggatg 240
gggaagtctg ataagcctga tctggattat ttttttgatg atcatgtgag atatctggat 300
gcttttattg aggctctggg gctggaggag gtggtgctgg tgattcatga ttgggggtct 360
gctctggggt ttcattgggc taagagaaat cctgagagag tgaaggggat tgcttgtatg 420
gagtttatta gacctattcc tacatgggat gagtggcctg agtttgctag agagacattt 480
caggctttta gaacagctga tgtggggaga gagctgatta ttgatcagaa tgcttttatt 540
gagggggctc tgcctaagtg tgtggtgaga cctctgacag aggtggagat ggatcattat 600
agagagcctt ttctgaagcc tgtggataga gagcctctgt ggagatttcc taatgagctg 660
cctattgctg gggagcctgc taatattgtg gctctggtgg aggcttatat gaattggctg 720
catcagtctc ctgtgcctaa gctgctgttt tgggggacac ctggggtgct gattcctcct 780
gctgaggctg ctagactggc tgagtctctg cctaattgta agacagtgga tattgggcct 840
gggctgtttt atctgcagga ggataatcct gatctgattg ggtctgagat tgctagatgg 900
ctgcccgggc tggccggcta atagttaatt aagtaagcgg ccgcnnnn 948

<210> 50
<211> 951
<212> DNA
<213> Artificial Sequence

<220>
<223> A synthetic oligonucleotide

<220>
<221> misc_feature
<222> 1-951
<223> n = A, T, G, or C

<400> 50
nnnngctagc cagctggcgc ggatatcgcc accatgggat ccgagattgg gacagggttc 60
ccttttgatc ctcactatgt tgaagtgctg ggggaaagaa tgcactacgt ggatgtgggg 120
cctagagatg ggaccccagt gctgttcctc cacgggaacc ctacatctag ctacctgtgg 180
agaaatatta tacctcatgt tgctcctagt cataggtgca ttgctcctga tctgatcggg 240
atggggaagt ctgataagcc tgacttagac tacttttttg atgatcatgt tcgatacttg 300
gatgctttca ttgaggctct ggggctggag gaggtggtgc tggtgataca cgactggggg 360
tctgctctgg ggtttcactg ggctaaaagg aatccggaga gagtgaaggg gattgcttgc 420
atggagttta ttcgacctat tcctacttgg gatgaatggc cagagtttgc cagagagaca 480
tttcaagcct ttagaactgc cgatgtgggc agggagctga ttatagacca gaatgctttc 540
atcgaggggg ctctgcctaa atgtgtagtc agacctctca ctgaagtaga gatggaccat 600
tatagagagc cctttctgaa gcctgtggat cgcgagcctc tgtggaggtt tccaaatgag 660
ctgcctattg ctggggagcc tgctaatatt gtggctctgg tggaagccta tatgaactgg 720
ctgcatcaga gtccagtgcc caagctactc ttttggggga ctccgggagt tctgattcct 780
cctgccgagg ctgctagact ggctgaatcc ctgcccaatt gtaagaccgt ggacatcggc 840
cctgggctgt tttacctcca agaggacaac cctgatctca tcgggtctga gatcgcacgg 900
tggctgcccg ggctggccgg ctaatagtta attaagtagg cggccgcnnn n 951

<210> 51 
<211> 882
<212> DNA 

<213> Rhodococcus rhodochrous 
<400> 51 

atgtcagaaa tcggtacagg cttccccttc gacccccatt atgtggaagt cctgggcgag 60
cgtatgcact acgtcgatgt tggaccgcgg gatggcacgc ctgtgctgtt cctgcacggt 120 
aacccgacct cgtcctacct gtggcgcaac atcatcccgc atgtagcacc gagtcatcgg 180 
tgcattgctc cagacctgat cgggatggga aaatcggaca aaccagacct cgattatttc 240 
ttcgacgacc acgtccgcta cctcgatgcc ttcatcgaag ccttgggttt ggaagaggtc 300 

gtcctggtca tccacgactg gggctcagct ctcggattcc actgggccaa gcgcaatccg 360
gaacgggtca aaggtattgc atgtatggaa ttcatccggc ctatcccgac gtgggacgaa 420 
tggccggaat tcgcccgtga gaccttccag gccttccgga ccgccgacgt cggccgagag 480 
ttgatcatcg atcagaacgc tttcatcgag ggtgcgctcc cgaaatgcgt cgtccgtccg 540 
cttacggagg tcgagatgga ccactatcgc gagcccttcc tcaagcctgt tgaccgagag 600 

ccactgtggc gattccccaa cgagctgccc atcgccggtg agcccgcgaa catcgtcgcg 660
ctcgtcgagg catacatgaa ctggctgcac cagtcacctg tcccgaagtt gttgttctgg 720 
ggcacacccg gcgtactgat ccccccggcc gaagccgcga gacttgccga aagcctcccc 780 
aactgcaaga cagtggacat cggcccggga ttgcactacc tccaggaaga caacccggac 840 
cttatcggca gtgagatcgc gcgctggctc cccgcactct ag 882 


<210> 52 
<211> 43 
<212> DNA 

<213> Artificial Sequence 


<220> 

<223> A synthetic oligonucleotide 
<400> 52 

cttgggtttg gaagaggtcg tcctggtcat ccacgaatgg ggc 43

<210> 53 
<211> 42 
<212> DNA 
<213> Artificial Sequence
<220>

<223> A synthetic oligonucleotide 
<400> 53 

tgagccccat tcgtggatga ccaggacgac ctcttccaaa cc

<210> 54 
<211> 43 
<212> DNA 
<213> Artificial Sequence

<220> 

<223> A synthetic oligonucleotide 
<400> 54

cttgggtttg gaagaggtcg tcctggtcat ccactactgg ggc 

<210> 55 
<211> 42 
<212> DNA

<213> Artificial Sequence 

<220> 

<223> A synthetic oligonucleotide 


<400> 55 

tgagccccag tagtggatga ccaggacgac ctcttccaaa cc 

<210> 56 
<211> 38
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> A synthetic primer
<400> 56 

ccagttagat atgacataga attaaattac tattcacc 



<210> 57
<211> 38 
<212> DNA 

<213> Artificial Sequence 


<220> 

<223> A synthetic primer 
<400> 57 

ggtgaatagt aatttaattc tatgtcatat ctaactgg

<210> 58 

<211> 38 

<212> DNA 

<213> Artificial Sequence

<220> 

<223> A synthetic primer 
<400> 58

ccagttagat atgagataga attacagtac tattcacc 

<210> 59 
<211> 38 
<212> DNA

<213> Artificial Sequence 

<220> 

<223> A synthetic primer 


<400> 59 

ggtgaatagt actgtaattc tatctcatat ctaactgg 

<210> 60 
<211> 38
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> A synthetic primer
<400> 60

ccagttagat atgacataga attacagtac tattcacc 

<210> 61 
<211> 38
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> A synthetic primer
<400> 61 

ggtgaatagt actgtaattc tatgtcatat ctaactgg 


<210> 62 
<211> 40 
<212> DNA 

<213> Artificial Sequence 


<220> 

<223> A synthetic primer 
<400> 62 

caacaggtcg acgccgccat gaaagagtta aatgatttag

<210> 63
<211> 36 
<212> DNA 
<213> Artificial Sequence

<220> 

<223> A synthetic primer 
<400> 63

gtagtcaccg gtaaattcct tcattacact cttggc 

<210> 64 
<211> 4 
<212> PRT

<213> Artificial Sequence 
<220>

<223> A synthetic peptide 



<400> 64 
Asp Glu Val Asp
1 